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ABSTRACT 

Time series models with autoregressive, moving average and mixed 
autoregressive~moving average correlation structure and with symmetric, 
heavy-tailed, non-normal marginal distributions, called ?, -Laplace, are 
considered . 

First, a flexible mixed model NLARMA(p,q) with Laplace (double 
exponential) marginals is investigated. The correlation structure for 
several special cases is derived. The innovation sequence for the 
second-order autoregressive case, NLAR(2), is derived. Parameter 
estimation in the NLAR(1) models is discussed in terms of moments, least 
squares and maximum likelihood. 

Second, a family of continuous random coefficient models with 
i-Laplace distributions are examined. The H-Laplace distribution is 
described along with a useful transformation. The correlation structure 
for special cases is derived. For a special case when 1 is one, the 
BELAR(I) model with Laplace marginals, the maximum likelihood estimator 
of serial correlation is derived. Least squares estimates are also 
derived using the concept of a linear residual. These estimators of 
correlation, along with other estimators of location and scale are 
compared in a small simulation study. 

Thirdly, the NLAR(1) and the BELAR(I) processes are compared using 
higher order residual analyses based on the uncorr elated , but dependent 
linear residuals, 

Finally, open problems, as well as possible extensions and 
applications of the analyses given in this thesis are discussed. 
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I. INTRODUCTION 



In standard time series analysis, one assumes. the marginal 

distributions of {X } are Normal, i.e. Gaussian. However, a Gaussian 

n 

distribution will not always be appropriate. In earlier works by Gaver 
and Lewis [Ref. 1]; Jacobs and Lewis [Refs. 2,3]; and Lawrance and Lewis 
[Refs. ^,5,6], stationary non-Gaussian time series models were developed 
for variables with positive and highly skewed marginal distributions. 

There still remain other situations for which Gaussian marginals are 
inappropriate, i.e. the marginal time series variable being modelled, 
although not skewed or inherently positive valued, has a large kurtosis 
or long-tailed distribution. The position errors in a large navigation 
system have such a distribution. In particular, Hsu [Ref. 7] modelled 
pooled position errors using the double exponential distribution (also 
called the Laplace distribution). Also McGill [Ref. 8] showed that the 
Laplace distribution provides a characterization of the error in a 
timing device under periodic excitation. Speech-waves are modelled 
using Laplace variables (Davenport [Ref. 9]). In the "speech-like" 
process given by the linear AR(1) model 



X = cX + (1 - , 

n n-1 n 



( 1 . 1 ) 



where .Q < c i .9, the innovation sequence {E^} is i.i.d. Laplace (Linde 
and Gray [Ref. 10]). In image coding systems using a two-dimensional 
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discrete cosine, DC, transform, Reininger and Gibson [Ref. 11] showed 
that the Laplace distribution gives the best approximation to the 
distribution of the non-DC coefficients. Recently Sethia and Anderson 
[Ref. 12] required a stationary aut or egr essi ve process with Laplace 
marginals in their research in communications technology. 

Even before Gaver and Lewis [Ref. 1] wrote the pioneering paper on 
the subject of aut or egr essi ve processes with a specified non-Normal 
marginal distribution, Gastwirth and Wolff [Ref. 13] had derived a 
solution to the linear additive first-order difference equation 

for which ( } is marginally Laplace. This result was used later by 
Gastwirth and Rubin [Ref. 14] within the context of robust estimation on 
dependent data. This solution to (1.2) is here called the Laplace 
First-order Autor egr ess i ve Process (LAR(1)). The early solution of 
(1.2) is mentioned at this point, merely to further substantiate the 
claim that non-Normal, heavy-tailed distributions are of interest. 

In this thesis, several time series models with a specified 
symmetric, heavy-tailed marginal distribution are presented. This 
distribution, called the Ji-Laplace distribution, includes the Laplace 
distribution as a special case. The approach in Chapter II extends the 
discrete random coefficient model of Lawrance and Lewis [Ref. 6], New 
Exponential Autor egr essi ve Moving Average--NEARMA( p ,q ) , to the case 
where the marginal distribution is Laplace, also called double 
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Exponential. This class of models is called The New Laplace 
Autoregressive Moving Average model, NLARMA(p,q). Several special cases 
of NLARMA(p,q) are individually researched. The second-order 
autoregressive model, NLAR(2), is established by showing the conditions 
for existence and uniqueness and by specifying the innovation structure. 
The correlation structure of NLAR(2) is also given along with results 
concerning directional moments and partial time reversibility. 

For the case when p = 1 and q = 0, called NLAR(1 ) , the distribution 
of the difference is derived, providing some insight into the 

nature of the differenced NLAR(l) model. The conditional density of X 

n 

given is also derived, which leads to a brief investigation of the 

likelihood function. Parameter estimation in NLAR(l), however, is 

limited to comparisons of the moment estimators and the least squares 

estimators for the independent model parameters of serial correlation. 

The correlation structure is derived for other models in the 

NLARMA(p,q) family: the first-order moving average called NLMA(1); the 

first-order mixed model called NLARMA(1,1); and the special cases of 
t h 

p -order autoregressive models called TLAR(p) that are analogous to the 
TEAR(p) model of Lawrance and Lewis [Ref. 6]. These models demonstrate 
the flexibility of the NLARMA(p,q) family. 

In Chapter III, a family of stationary time series is developed 
using continuous random coefficients in the additive difference equation 
model. The marginal distribution is specified to be a member of the so- 
called i-Laplace distributions, the properties of which are described at 
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the beginning of the chapter. The "square-root Beta-Laplace" transform 
is defined. It is used to formulate the i-Laplace time series models. 

For the special case when i = 1 , the marginal distribution is again 
Laplace. The autoregressive model is called the Beta-Laplace First- 
Order Autoregressive model, BELAR(I). The conditional density of 
given is derived. This leads to the derivation of a likelihood 

function and a numerical technique to evaluate and maximize the 
likelihood function with respect to the model parameter for serial 
correlation . 

Several facets of the parameter estimation problem are investigated 
for BELAR(l). The behavior of different estimators of scale and 
location are compared using the Simulation Testbed (SIMTBED) of Lewis, 
Orav and Uribe [Ref. 15]. The least squares estimation theory is 
derived around the concept of a linearized residual. Asymptotic 
properties are derived using results from Nicholls and Quinn [Ref. l6]. 
Robust estimators are defined and simulated in SIMTBED. Finally, a 
numerical scheme for finding the maximum likelihood estimator of serial 
correlation is used in a small simulation study of the small sample 
properties of the maximum likelihood estimator. 

In the last section of Chapter III, a first-order moving average 

t h 

model is discussed. A q -order moving average model in Sl-Laplace 
variables is also derived. 

The random coefficient approaches are not the only ways to generate 
Laplace or other variables with a specified correlation structure. The 
literature contains numerous articles on generation of random sequences. 
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One approach put forth in several papers (Gujar and Kavanagh [Ref. 17]; 
Haddad and Valisalo [Ref. 18]; Li and Hammond [Ref. 19]; Liu and Munson 
[Ref. 20]; Sondhi [Ref. 21]) involves passing white Gaussian noise 
through a linear filter followed by a zero memory nonlinear transform. 
This is a general procedure that produces exactly the required marginal 
distribution and a good approximation to the autocorrelation structure. 
However, the scheme lacks the simplicity of either of the methods being 
proposed. Moreover, the filtering approach produces, for example, in 
the first-order autoregressive case, only one process. 

It is important to note that in non-Normal time series, there are 
infinitely many processes with a given marginal and autocorrelation 
structure. This is the case, for example, in the two-parameter NLAR(1) 
process. The differences in these processes must be explored through 
higher joint moments. In Chapter IV, residual analyses using fourth 
joint moments are derived. The ideas are modifications of those from 
Lawrance and Lewis [Refs. 6, 22], who accomplished an analysis using 
joint third moments within the NEAR framework. The residual analysis is 
applied to show the differences in the various NLAR(1) processes and the 
BELAR(I) process. 

In Chapter V, open problems and possible extensions of the analyses 
given in this thesis are discussed. Possible applications to the 
analysis of wind velocity data are detailed. 
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II. DISCRETE RANDOM COEFFICIENT MODELS WITH LAPLACE MARGINALS 



A. INTRODUCTION 

Two aspects of modelling with dependent random variables are widely 
studied — the marginal distribution and the correlation structure. It is 
widely known how to generate sequences with either a specified marginal 
distribution or a particular correlation structure. Transforming the 
random variables may have an undesirable and unknown effect on the 
correlation structure. Likewise, the marginal distribution of a 
filtered process may be unknown. 

It is the generation of random variables with both a specified 
marginal and a specified correlation structure that is discussed in this 
chapter. Specifically, we want sequences with a Laplace (double 
Exponential) marginal distribution and with ARMA(p,q) correlation 
structures as given by Box and Jenkins [Ref. 23] for the usual linear 
ARMA(p,q) models. 

The following is an example of a process that has Laplace marginals. 

Let {X } be a binary Markov chain with transition matrix P, so that 
n 

■ “!• PCV'l^n-l'O ' ““ 

X 

P[X =0lX =1] = 1-a„. Let L„ = (-1) *^E , where {E } is an i.i.d. 
n ' n-1 ■’ 2 n n n 

Exponential sequence. If a^=a 2 = ot. ^ ^ ^ Laplace marginal 

distribution. However, the correlation structure is not that of an 
AR(1) process. It is, in fact, easy to see that Corr ( L^ , L^_,^) = 
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(1 /2) (2o-1 ) ^ , for k=±1,±2,..., which is not a pure geometric function 
of k. 

Two processes which produce an AR( 1 ) correlation structure and a 
Laplace marginal distribution are the Laplace Discrete AR(1), LDAR(1), 
which is an adaption of the DAR(1) process of Jacobs and Lewis [Ref. 2], 
and the linear process of Gastwirth and Wolff [Ref. 13]. called the 
LAR( 1) process. The LDAR(1) model produces an {X^} sequence using the 
first-order autoregressive equation with random coefficients 



= V X , 
n n-1 



n n 



(II. A. 1) 



where {V^} is an i.i.d. sequence of Bernoulli random variables with 
P{Vn=1} = 1-P{V^=0} = p; {L^} is an i.i.d. sequence of Laplace random 
variables. The coefficient and innovation processes from time n are 
assumed to be independent of X^_^ ,X^_ 2 , . . . . This sequence produces runs 
of constant value when successive realizations for produces the value 
1. When is zero, a new value is selected. Although LDAR( 1 ) is of 
limited value in general application because of this runs property, it 
is significant in that it is one of the first in a series of more 
general discrete random coefficient equation models for non-Normal time 
series, and it produces a first-order autoregressive Markovian process 
for any specified marginal distribution. 

The LAR(1) model turns out to be a special case of the more general 
process called the New Laplace Autoregr essi ve Moving Average model, 
NLARMA(p,q). Properties of the LAR(1) process are pointed out in the 
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next section of this chapter, which gives a charact er i zat i on of the 
Laplace distribution. 

The NLARMA(p,q) model is a very useful family of time series models 
that are discrete random coefficient linear difference equations. The 
models are extensions of the NEARMA(p,q) structure of Lawrance and Lewis 
[Refs. 4,5,6] to those cases where the underlying marginal distribution 
is Laplace rather than Exponential. The family provides great 
flexibility to systems modelling, because of the broad range of 
correlations and different dependency structures which are obtainable. 

Section C is an examination of the second-order autoregressive model 
of the family, NLAR(2), for p = 2 and q = 0 in NLARMA(p,q). Conditions 
for the existence and uniqueness of the strictly stationary NLAR(2) 
model are derived using results from Nicholls and Quinn [Ref. 16] about 
Random Coefficient Autoregr essive models of order k, RCA(k). In a 
proof, very similar to that given by Lawrance and Lewis for the NEAR(2) 
model [Ref. 6], the innovation for the NLAR(2) model is derived 
explicitly. The innovation is shown to be a convex combination of 
scaled Laplace variables. The correlation structure in the NLAR(2) 
model is shown to satisfy the Yule-Walker type equations just as do the 
linear AR(2) models. Aspects of directionality and time r eversi bil ity 
are also addressed. 

In Section D, the first-order autoregressive model, NLAR(1), is 
described. It is a two-par ameter , first-order Markov process which is a 
special case of the NLAR(2) model. The distribution of differences is 
derived. The conditional density of given and the likelihood 
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function are also derived. The non-differentiability of the likelihood 



function for all values of the two parameters has prevented the 

development of the maximum likelihood estimators. Parameter estimation 

is discussed within the context of moment estimators and least squares, 

using the usual linearized residual. 

In Section E, several different special cases of NLARMA(p,q) are 

formulated and briefly discussed. The correlation structure for a 

first-order moving average model, NLMA( 1 ) , and a mixed aut or egr ess i ve 

moving average model, NLARMA(1,1) are given. Correlation structure is 

t h 

derived and parameter estimation is discussed for the general p -order 
autoregressive models, TLAR(p), which are special cases of the NLAR(p). 

Each of these models in Section E could well be the basis for 
further research. The intent at this point is primarily to further 
substantiate the claim of wide versatility and tractability in modelling 
non-Normal time series within the context of the NLARMA(p,q) family. 

For example, the bivariate AR( 1 ) process with Exponential marginal 
distributions of Dewald and Lewis [Ref. 24], can be extended to the case 
where the marginal distribution is Laplace. This, however, is not 
discussed further in this thesis. 

B. CHARACTERIZATION OF THE LAPLACE DISTRIBUTION 
1 . Properties of the Laplace Distribution 

The Laplace distribution is also known as the double Exponential 
distribution. In general, the density of a Laplace distributed 
variable, L, has two parameters — a location parameter -» < p < +«, and a 
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scale parameter X > 0. The parameter u is fixed here at zero. For 
-oo < X < » we have 



^ exp(- |x|/X). (II. B. 1.1) 

In what follows we will define {L } as a sequence of i.i.d. 

n 

random variables of the Laplace distribution with X = 1 (Standard 
Laplace). The characteristic function of the standard Laplace variable 
is 






1 

1 + 0 )="’ 



-® < ( 1 ) < 00 , 



(II. B. 1 .2) 



and we have 



e(l") 



0 if n is odd, 

n! if n is even, 



(II. B. 1.3) 



so that E(L) = 0, Var(L) = 2, skewness is zero, and kurtosis is 3. The 
value of the kurtosis indicates that the symmetric Laplace distribution 
has heavier tails than the normal distribution, for which the kurtosis 
is 0. 

The sum of n ^ 2 i.i.d. standard Laplace variables can be 
written as the difference of two i.i.d. random variables Y^ with 

Gamma distribution, shape parameter k = n and scale parameter X = 1. 
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This follows immediately from the characteristic function. Let 



n 

Y = y L. ; then 

iil ^ 






1 + 



+ id) 



1 - iu) 



(j>Y (lO) <Py (-U)) . 
1 2 



(II. B. 1.4) 



This result is quickly generalized. Replacing n by t > 0, we see that 
C(j), (dj)]^ is the characteristic function for the variable X = Y. - Y„ 

Li I ^ 

where Y^ - Gamma(t,1), i = 1,2 and Y^ and Y^ are independent. This 
demonstrates that the Laplace distribution is infinitely divisible. 

Another useful result is obtained from (II. B. 1.4) when n = 1. 
It shows that a Laplace variable is the difference of two i.i.d. 
exponential (X = 1) variables. This makes it quite simple to generate 
Laplace distributed variates in computer simulations. 

Random variables with a standard Laplace distribution are self- 
decomposable. Let 



(p (u>) = <j), (dj)/<J), (pdi) , 0 ^ p < 1. (II. B. 1.5) 

C Li Li 

According to Feller [Ref. 25: p. 588], if <|)^(d)) is the transform of a 

random variable for each 0 ^ p < 1 , then L is said to be self- 
decomposable. But for -“ < d) < ® 

(}i^(dj) = {1 + (pd))^}(1 + d)^) ^ , 

= {p + (1 - p)(1 - id))'^{p + (1 - p)(1 + id))"^} (II. B. 1.6) 
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+ (1 - p")(1 + 



( II. B. 1.7) 



We recognize (II. B. 1.6) as the product of the character! Stic functions 
of two i.i.d. innovation variables, and ~^ 2 ’ described in the 
EAR(1) process in [Ref. 1]. Also from (II. B. 1.7) 



e 



0 w.p . 

L w.p . 




(II. B. 1.8) 



2. The Laplace First-Order Autoregressive Process, LAR(1) 

The i.i.d. sequence with distribution given in (II. B. 1.8) 

is the innovation process of a first-order linear autoregressive 
equation 



X = pX + e , (II.B.2. 1) 

n ^ n-1 n 

where {X } is a stationary time series with double exponential marginal 
n 

distribution, |p|<l. This is the LAR(l) model. It is actually a 
rediscovery in light of the fact that Gastwirth and Wolff [Ref. 13] had 
derived it earlier; also, Gastwirth and Rubin [Ref. 14] discuss it 
within the context of robust estimation techniques. The present account 
of LAR(1) includes new results. 

The LAR(l) model has the same properties as the EAR(l) model in 
[Ref. 1] with two important differences. First, if -1 < p < 0, negative 
serial correlations for odd lags are obtained. Secondly, it is 
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partially time reversible in the sense that for all S. and n, both of the 
following are true; 



E(X-X^.) - - 0, 



n n+J. 



(II.B.2.2) 



P(X S X J = P(X < X , ) = 1 /2. 
n n-1 n n-1 



(II.B.2.3) 



These results are derived in Section II. C and Section II. D. Note, 
however, that since LAR(l) is a linear AR( 1) model with non-Gaussian 
innovation { ) f it is not fully time reversible (Weiss [Ref. 26]). 
Also, note that this LAR(1) model has the zero defect property; when 
= 0 , then = P P can be determined exactly in long enough 

runs of the series This property is generally undesirable, but 

the broader NLAR(2) model developed in the next section is free of this 
defect, except for the special parameter values for which it reduces to 
the LAR(l) model. 

If no repeats are observed in a realization of the time series, 
an extremely efficient estimator of p for LAR(l) is the median of the 
ratio X./X. ,. The simulation results given in Table II. B. 2.1 

substantiate this claim. In Section II. D. 4 and again in III.E.5, using 
the framework of the Simulation Testbed (SIMTBED) [Ref. 15], we will see 
that this median ratio is for small samples very biased, and is, 
apparently, asymptotically biased in all of the random coefficient AR( 1 ) 
models with a Laplace marginal distribution that we examine. 
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TABLE II. B. 2.1 



Simulation Results using Median {X./X. .} to Estimate 

11-1 

p in the LAR(1) Process for Samples of Size 2000 



True p 




p = med {X^/X._^} 


Comments 


-.9 




-.9 


-.9 occurred 1586 times 
in 1999 ratios 


-.2 




-.2 


-.2 occurred 75 times 
in 1999 ratios 


-.1 




-.087^*6 


-.1 occurred 11 times 
in 1999 ratios 


+ .01 




+ .01 986 


+.01 never occurred in 
1999 ratios 


+ .5 




+ .5 


+.5 occurred ^90 times 
in 1999 ratios 


+ .75 




+ .75 


+.75 occurred 1 1 ^9 

times in 1999 ratios 


SECOND ORDER AUTOREGRESSIVE LAPLACE TIME 
Introduction 


SERIES MODEL, NLAR(2) 


Using the 


terminology from [Ref. 6] 


the following time series 



model called NLAR(2), New Laplace Second-order Autoregressive model is 
proposed. This is a special case of NLARMA(p,q) model with p = 2, 
q = 0. The NLAR(2) model has four parameters, double exponential 
marginal distribution for second-order autoregressive Markov 

dependence, and autocorrelations satisfying Yule-Walker type equations. 
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The stationary NLAR(2) model has the same form as the stationary 
NEAR(2) model in [Ref. 6], Writing the time series {X^} in the form of 
an additive, linear, random coefficient autoregressive difference 
equation, we have for all n that 



X 

n 



e, K' X + 
1 n n-1 



K" X^ _ + £ , 
2 n n-2 n 



(II.C.1 .1) 



where {K' , K"} is a sequence of i.i.d. discrete bivariate random 
n n 

variables with distribution 



{K’ , K"} 
n n 



(1,0) W.p.Ql^, 

(0,1) w.p. ot^ f n~0,i1,i2, ...j 

(0,0) w.p. 1- (II.C.1. 2) 



le^} is an i.i.d. innovation sequence whose distribution is given in 
(II.C.2.M); and and t are mutually independent and 

independent of X^_ ^ ,X^_^ , . . . . The parameter space is defined by 
0 ^ I 8^ I ^ 1 and 0 ^ ^ 1 , i = 1,2; = 1* Graphs of the 

admissible regions in the parameter space and the correlation space are 
presented in Section II. C. 3- 

Equations (II. C. 1.1) and (II.C.1. 2) have a direct physical 
interpretation. The observed process at time n, X^, is only one of 
three possibilities: i) X^ is some multiple of what it was at time n-1, 

g^X^_^ , plus some random noise e^; ii) X^ is some multiple (possibly 
different than 6^) , of its value at time n-2, some 
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independent random noise; iii) is just random noise, independent 
of everything up to time n. 

2. Existence and Uniqueness 

The work of Nicholls and Quinn [Ref. 16 ] on random coefficient 
autoregressive models is relevant to the NLAR(2) process. They have 
given the necessary and sufficient conditions for the existence of the 
unique covariance stationary solution to the following class of 
univariate random coefficient autoregressive models of order p, RCA(p): 

"n ■ J, '''i * V“>Vl ' (II.C.2.1) 

n = 0, ± 1 , ±2 where 

a. the Y. 's are real constants; 

1 

b. ^ p-vector , second-order stationary, independent process 

with E(B ) = 0 and constant covariance matrix; 

-n 

c. is a scalar, second-order stationary, independent process, 

independent of with E(e^) = for all n. 

They also have shown that if {B } and {e } are i.i.d. processes, 

-n n 

then the solution [Z } is strictly stationary and ergodic. 

n 

Let Y. = a. 6. for i = 1,2 and B (1) = 6,(K’ - aj and B (2) = 
111 n Ini n 

6-(K" - a„) . Then (II. C. 1.1) and (II. C. 2.1) have the same form. That 
2 n 2 

is (II. C. 1.1) is an RCA(2) model if the innovation of NLAR(2) satisfies 

condition (iii) above. Thus applying the results in [Ref. 16: p.3l and 

p.37], there exists a uni que strictly stationary and ergodic solution to 

(II. C. 2.1) for Y. and B (i) as defined above, if and only if all of the 
1 n 
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roots of the characteristic equation 



(t^ - - a^BpCt^ - = 0, 



(II.C.2.2) 



are within the unit circle, i.e. iff a^B^ + *^ 2^2 satisfied 

for the conditions on the parameters defining NLAR(2), thus establishing 
the existence of the model (II.C.1.1). 



R CA ( p ) models in [Ref. 16]. It is, in fact, determined by the 

independent choices of the innovation and the random coefficients. 

However, by specifying the marginal distribution and the random 

coefficients, in NLAR(2) the innovation is restricted more than in the 

RCA(p) model. If the X in (II.C.1.1) or Z in (II. C. 2.1) have a 

n n 

standard Laplace marginal distribution, then all their moments are given 
by (II. B. 1.3). From ( II.C . 1 . 1 ) or ( II. C . 2. 1 ), i t follows that for all 



No marginal distribution is ascribed to solutions of the general 



P = 1.2, ... 

£(6^*^) = {( 2 k)!} [1 - 

n 11 2 2 




2(k-i) 



)E(e^^'^ ^^/{(2i)!}}] > 0, 



(II.C.2.2) 



and for this to be true it is necessary that 




(II.C. 2. 3) 
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Since and are probabilities it is necessary that |6j^| i 1 for 
i = 1,2 for (II.C. 2 . 3 ) to hold. If not, there exists for every > 0 

Om Om 

and > 0 an integer m, such that or “ 2^2 greater than 1. 

We have now established the necessary conditions on the 
innovation and on and — namely that | B J S 1, i - 1,2 — for 

the existence of a unique strictly stationary solution to (II. C. 2.1) 
with a marginal Laplace distribution and with the random coefficients 
given by (II. C. 1.2). In the next theorem, we show that | B^ | ^ 1 for 
i = 1,2 is also a sufficient condition and that such an innovation 
random variable exists. We also give its explicit form--a convex 
combination of Laplace random variables. For simplicity, the parameter 
space is regarded as being described by strict inequalities for 
THEOREM 1. Let be a stationary process with standard Laplace 

marginal distribution. For all n, let equations (II. C. 1.1) and 
(II. C. 1.2) hold with 0 < | 6 J < 1 , 0 < < 1 for i = 1,2 and 

^ • Then 



e 

n 



K 



n 



L 

n 



L w.p . 
n ^ 



•P2-P3. 




L 

n 



w.p . 



P 



2 ’ 



|b 3 |Ln w.p. p^. 



(II.C.2.4) 



where {L } are i.i.d. standard Laplace variates; the K 's have values in 
n n 

{1 , Ib^l. and are independent of {L^} and {K^, K|^} for all n. 

They are also independent of *^^- 2 ’ * ' ' ' Furthermore, 
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?2 = + a2B|)b| - 



(II.C.2.5) 



(a^ + B|}/(b| - b^)(1 - b|) , 

= {(a^ + 02 ) 6f 6^ - (a^Bj + a2B|)bp/(b| - b^) (1 - b^) , (II.C.2.6) 

1 > b| = ^{s + (s^ - > b^ = ^{s - (s^ - > 0, (II.C.2.7) 

s = (1 - a^)B^ + (1 - ct2^^2’ (II.C.2.8) 

r = (1 - a - a.)B?B^. (II.C.2.9) 

1 <1 1 2 

Proof : 

For the NLAR(2) model specified by (II. C. 1.1), (II. C. 1.2) and 
(II. C. 2. 11 ) - (II.C.2.9), let 4 >y((i 3) and (p (oj) be the characteristic 

A C 

functions of the {X^} and {e^} sequences. If {X^} is stationary, then 

<t)^((i3) = 4)^((ij){a^(})^( B^o)) + + (1-a^-a2)}. (II. C. 2. 10) 

Assuming a standard Laplace marginal distribution for the 

independent distribution of {e^} has a characteristic function, possibly 
improper, given by 



4 i^( w) 



( 1 +6^(1)^) ( 1 +B|uj^) 

(1+uj*) [( 1-a^ - 02 ) B^Blfj^” + {(l-a^)B^ + (1-a2)B|}w^ + 1]* 

(II. C. 2. 11) 
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It is convenient to factor the quadratic in in the 
denominator of (II. C. 2. 11). The roots of this factor are both real and 
distinct. To see this, note that 



{(l-a^)B^ + (l-a2)6|l" - 4( 1-a^ -a^) 



= {(l-a^)B^ “ > O' 



The roots are also both negative, which can be seen by noting that the 
product 1 /(I "01^ 02) 6^^ B| is positive, but the sum r^ + r2 

= -{(l-a^)B^ + ( 1-02) B2I /( 1~a^ -02) 6^ B| is negative. 

Letting r^ = -1 /b^ and r2 = -1/b^, we can rewrite (II. C. 2. 11) 
using partial fraction decomposition to obtain 






(II. C. 2. 12) 



From (II. C. 2. 11) 



and 



b| + b^ = (l-a^)B^ + (1-02)6! 



b^bl = (l-a^-a2)6*B! = r. 



= s 



(II. C. 2. 13) 



(II.C.2.1U) 



Comparing (II. C. 2. 12) and (II. C. 2. 11) term for term also yields 



and 



+ Po(1-t>!) = + 0123: 



(II. C. 2. 15) 
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( II. C. 2. 16) 



^ P3(^“*^3)’^2 ° (a^ +012) 6|6|. 

Expressions for b|, b^, P2 and p^ are obtained in terms of , 
a^, 6^ and by solving (II. C. 2. 13) ~ (II.C.2.16). From solving 
(II. C. 2. 15) and (II.C.2.16) simultaneously, we obtain (II.C.2.5) and 
(II.C.2.6). Equations for b| and b^ given in (II.C.2.7) are obtained 
from solving (II. C. 2. 13) and (II.C.2.1M) simultaneously. Arbitrarily, 
let b| be the larger value. 

It remains now to show that the inversion of (II. C. 2. 12) will, 
in fact, yield a function that is a probability density and is the 
mixture of densities for scaled Laplace variables. To do this, we show 
that P2 and p^ can be interpreted as probabilities and that P2 + p^ < !• 

To establish that P2 + p^ < 1 » we have, after adding (II.C.2.5) 
and (II.C.2.6) 



(a^6^+a26|) ~ 

^2 ^ P3 ° (1-b|) (1-b^) 



( II. C. 2. 17) 



Multiplying out (l-b|)(l-b|) and using (II. C. 2. 13) and (II. C. 2. 14), we 
have, after some rearrangement 



P 



2 




(i-ep(i-6|) 

(1-6p(1-6|) + a^6f(1-B|) + a 26 |(l- 6 p * 



(II. C. 2. 18) 



This expression is clearly positive and less than one, from 
which it follows that P2+P2<’'* 
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To show that and are probabilities, it remains to show 
that they are both positive. To do this, it is shown that the 
numerators and the denominators of (II.C.2.5) and (II.C.2.6) are 
positive. For the denominators, this requires that 0<b| , b^ <1, which 
is shown by noting 0<(1-bp (1-b^<1 . From (II. C. 2. 17) and (II. C. 2. 18), 
it follows that 

(1-b|)(1-b^) = (1-6;)(1-6|) + a^Bjd-e;) + a^B^d-Bj) > 0. 



Also, 



1 - d-bp(l-b^) = (b| + b^) - b|b^ 

= (l-a^)B^ + (1-02)62 “ ( 1-a^ -02) 6^ B2 

= (l-a^)B^d-Bp + ( 1 - 02 ) 6|(1-B|) + 6^B2 > 0. 



Therefore, b| and b^ are less than one, so P 2 and p^ have 
positive denominators. 

To see that P 2 and p^ have positive numerators, note that it 
must be true that 



b^ < b 



(o +a_) B?B^ 

__J ± ! £ < k2 

(a^B;+a 26 p 2 * 



(II. C. 2. 19) 
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Using (II.C.2.8) and (II.C.2.9), (II. C. 2. 19) is equivalent to 



-(s^-4r)^'^^ < 2b - s < (s^-Hr)^^^, 



or 



s^ - Up > (s-2b)^, 



or 



sb - - r > 0. 



(II. C. 2. 20) 



But the lefthand side of (II. C. 2. 20) is 

a^a 26 fe|( 6 f- 6 |)" 

(a^ B^+a^Sl) ^ 



which is strictly positive. 

Therefore, and p^ are both positive and p^+P^^I* Therefore, 
p^, P^ and l-p^'P^ can be regarded as probabilities. Therefore has a 
proper density which can be generated as the mixture of three Laplaces 
with scale parameters 1, [b^l and [b^l, respectively. Q.E.D. 

The general NLAR(2) model uses the four parameters to achieve a 
wide range of sample path behavior. Figure II. C. 2.1 illustrates four 
different realizations of the NLAR(2) process. In each case, the 
theoretical aut ocor r el at i ons are identical with p(1) = 0.6^1 and 
p(2) = 0.5. Also, note that each sample path was generated from the 
same i.i.d. standard Laplace sequence such that (X^ ,X^) = (L^,L 2 ). 

Since this is not the steady state bivariate distribution of (X ,X , ) , 
the sample paths illustrated in Figure II. C. 2.1 are displayed beginning 
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NLAR(2): SAMPLE PATHS; p(l) = .64 AND p(2) = ,5 
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Figure II. C. 2,1. MLAR(2) ; Sample Paths; p{l)=.64 and p(2)- 



with to avoid the initial transient behavior of the process. The 

50 1 

true value of each parameter is displayed above the corresponding sample 
path. Figure II.C.2.2 contains the scatter plots for each sample in 
Figure II.C.2.1. The sample size in each plot is 2000. 

Many special cases of the NLAR(2) model could be mentioned. The 
following have one or more of the parameters at their boundary value and 
have valid but less complicated results for the distribution of in 

(II.C.2.4). If “ *^2 ~ i.i.d. sequence {L^} and 

X = e . If a, = 1 then {e } is the innovation of the LAR(1) model 
derived from (II. B. 1.7) and (II. B. 1.8). If | 6 J = | 

< 1 then each is distributed as a scaled Laplace random 

variable, / l-a^-a^ L^. These models are called the TLAR(2) models, 
which are easily extendable to higher-order autoregressions, as 
discussed in Section II. E. If <1 and = 0 or ^2 ~ then is 

the innovation of the new first-order autoregressive model NLAR( 1 ) . 
This model is the subject of Section II. D. 

3 . Autocorrelation Structure 

In this section, it is shown that the autocorrelations 

p(il) = Corr (X^ , X, = 0,±1,±2,... of the NLAR(2) model satisfy the 

Yule-Walker type difference equations; thus the second moment dependency 

aspects are indistinguishable in form from those for the AR(2) process. 

We also compare the admissible regions of an AR(2) with (i) an NLAR(2) 

with ^ parameters and (ii) an NLAR(2) with only two parameters. 

From the independence of {K } and {K', K"}, and (II. C. 1.1), 
^ n n n 

(II. C. 1.2) and (II. C. 2.^1), we see that E(K^) = , E(KJ^) = and 
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NLAR(2): SCATTER PLOTS; p(1) = .64 AND p(2) = .5 

a,=.542,a2=.35.B, = 1..B2=.4375 a, = .665.a2=.286,B, = .815.B2=.535 
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Figure II.C.2.2. NLAR(2) : Scatter Plots; p(l)=.64 and p(2)= 



E(e ) = E(K )E(L ) = 0. Multiplying (II. C. 1.1) on both sides by X „ we 
n n n r ^ o j 

have for il ^ 1. E(X^X^_^) = 6^ E(X^_ ^ X^_^ ) > “2^2^^^n-2^n-H ^ * 

Dividing by Var (X^ ) we have p(-Jl) = a^S^p()l - 1) + - 2), since 

p(-J,) = p(Jl). Substituting for i = 1,2 and p(0) = 1 , we have 

p( 1 ) = a^ + a^pC 1 ) 

p(2) = a^pd) + a2» (II. C. 3.1) 

which are the same equations as those which occur for the AR(2) process. 

Since |g^| ^ 1 for i = 1,2 and = 1 in NLAR(2), the usual 

triangular admissible region for AR(2) given in Box and Jenkins 
[Ref. 23: p. 6l ] shrinks to the interior of a diamond- shaped area in 

(a^ = a^ = “ 2 ^ 2 ^ coordinates: |a^| + la^l = 1* (See F igur es 

II.C.3*1a and 1b). In (p(1), p(2)) coordinates the equation 

p(1)^ = (1 + p(2))/2 defining allowable combinations of p(1) and p(2) in 
AR(2) also changes. For NLAR(2), the space in (p(1), p(2)) coordinates 
becomes a triangular region bounded below by |p(1)| = ^ + p(2)}. (See 

Figures 1 1. C. 3 -2a and 2b). 

The reduction in allowable parameter or correlation combinations 

for NLAR(2) over the AR(2) model is not large. This encouraged us to 

consider a 2-parameter NLAR(2) model by specifying = g? , for i = 1,2, 

so that a^ = g^. The parameter space in (a^ ,a^) coordinates becomes the 

3 /2 

symmetric region bounded by the curves g| = ± (1 - gp (see Figure 
II. C. 3 . 1c). In (g^, g^) coordinates the admissible region of the two 
parameter model is bounded by the unit circle g| + g| = 1 . Using only 
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BOUNDARY OF ADMISSIBLE REGION IN PARAMETER COORDINATES 

o; LINEAR AR(2) MODEL b: NLAR(2) MODEL WITH 4 PARAMETERS 
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Figure II. C. 3.1. Boundary of Admissible Region in Parameter Coordinates 

for Linear AR(2) and NLAR(2) Processes 



POINT PLOTS OF ADMISSIBLE REGION FOR p(1) AND p(2) 

o: LINEAR AR(2> MODEL b: NLAR(2) MODEL WITH 4 PARAMETERS 





{Z)d 




{zy 
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Figure II.C.3.2. Point Plots of Admissible Region for p(l) and p(2) for 

Linear AR(2) and NLAR(2) Processes 



two parameters leads to the admissible region in Figure II. C. 3. 2c for 
(p(l), p(2)) space. The (p(l), p(2)) space was obtained by transforming 
the lines = a^ = c , - 1 ^ c ^ 1 , in Figure II.C.B.Ic to p(2) - 

( 1 -a 2 >p( 1 ) where |p(l)| i a^/d-a^) = 6^V(l-6p and 6’ - 

if a^ ^ 0 and if a^ < 0. 

All the plots in Fig. II. C. 3.1 were generated from a grid of 
equally spaced values of a^ and a^. In Fig. II. C. 3. la the points 
satisfy the Yule-Walker equations (5.1). In Figs. II. C. 3. 1b and 1c, the 
points also satisfy the conditions of Theorem 1. In Fig. II.C.3.2 the 
feasible combinations of p(1) and p(2) are plotted for those values of 
a^ and a^ from Fig. II. C. 3.1 using the Yule-Walker equations (5.1). 

4 . Directional Moments and Partial Time Reversibility 

In the last section, we demonstrated that the second moment 
dependency aspects of the NLAR(2) model were indistinguishable in form 
from those of the ordinary AR(2) model. Also, it is well known that if 
the linear autoregr essi ve model is not Gaussian, then the process is not 
completely determined by the first and second moments. Thus in model 
identification it becomes necessary to examine third order moments to 
further identify the process. Special third order moments E(X^ 
for all t, are known as directional moments. If the directional moments 
for all I are equal, which is necessary for a process to be fully time 
reversible, we say the process is partially time reversible in the sense 
of directional moments. 

A process is fully time reversible (Lawrance [Ref. 27]) if the 

joint distribution of , ..., is the same as that for X^^^, 

X ,, ...,X for all r and for all n. Since LAR(1), a special case of 
n+r-1 ’ ’ n 
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NLAR(2), is not fully time reversible, NLAR(2) is in general not time 
reversible . 

In this section we show by induction arguments that all the 
third order moments of NLAR(2) are the same as those for Gaussian AR(2) 
model; i.e. E(X.X.X ) = 0 for i, j, k . This implies particularly that 

1 J k 

the directional moments of NLAR(2) are equal and therefore that NLAR(2) 
is always partially time reversible. 

In Section II. B, we found that E(Xp = 0 for all i since X^ is 
marginally Standard Laplace. It is easy to establish the following two 
equations: 

" ^2“2^^^n^n-1 (II.C.4.1) 

E(X^X^ ,) = {(e^a-)/(1 -2e.6„a.a„)} E(X X^^ .). (II.C.4.2) 

n n-1 22 1 2 1 2 n n-1 

Solving (II.C.4.1) and (II.C.4.2), simultaneously yields E(X^X^_^) = 

E(X^X , ) = 0. 
n n-1 

Now, using separate induction arguments and the stationarity 

assumption, we establish that E(X X^ .) = 0 for all £ S 1 , and 

n n- £ 

E(X^X , ) = 0 for all k 2 1 . 
n n-k 

The proof of E(X^X^_^^) = 0 is straightforward. 

To prove E(X^X , ) = 0, we first show that the expectation of 
n n~k 

special third order moments of the form X X ,X , for k ^ 2 is zero. 

n n-1 n-k 

Define p = E(X X ,X , ) and assume E(X^X .) = 0, j 2 k - 1. From 
k n n-1 n-k n n-j 

(II. C. 1.1), 
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“k ■ «Vn-lVk> - “l6,'^»nV(k-l)> ^ “2“2«Vn-lV(k-l)> 

( Il.c.y. 3) 






< a ^'<-1 
(02^2 



Now from (II. C. 4.1) and (II.C.4.2), we have 



= E(X^X^_^) = a2^2^^^n^n-1 ^ ° Therefore y^ = 0. 

We now proceed to show that E(X.X,X ) = 0 for all i, j, k. 

1 J k 

Without loss of generality let i < j < k so that k=i+n,j=i+m 

and n > m. Therefore by stationarity E(X.X.X, ) = E(X.X. X. ) = 

^ ^ ijk ii+mi+n 

E(X.X. , ,X. ). Fixing m so that 0 < m < n we use induction on n. 

1 i-(n-m) i-n 

Let n = 2, implying m = 1 . The first step in the induction follows from 

E ( X^ X . _ ^ X ^ _ 2 ) = P2 = 0 - Next assume that for m < n ^ K, 

E(X.X. , ,X. ) = 0. Now we show that E(X.X. , ,X. , J = 0. 

1 i-(n-m) i-n i i-(K+1-m) i-(X+1) 

Using (II. C. 1.1), we write 



^^Vi-(K.1-m)^i-(K.l)) = “ieiE(X..,X._(^^^.^^X._(^^^)) 

+ “2®2^^^i-2^i-(K + 1-m)Xi-(K+D ) 

+ E(e.X._(j^^^_^^X._^j^^^j). 

Now E( e. X . _ _^^X. ) ) = E(g^)E(X._^j^^^_^^X._(j^^^ J ) = 0 and 

E(X,_^X._(^^^_^)X._(^^^)) = E(X.X..(^_^)X._^) = 0 by stationarity and 

the assumption. Likewise = 

E(X.X. f/„ iX. ,v) = 0. This completes the induction. 

1 i-{(K-1)-m} i-(K-l) 

An immediate result from the argument about third moments is 

that Z = X - X , for {X } of the NLAR(2) is not skewed, 
n n n-1 n 
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The residual analysis in [Ref. 6] and [Ref. 22] using cross 

correlations between linear autoregr essi ve residuals R = X - a.X , - 

n n 1 n-1 

a.,X , and their squares R^, does not shed any new light on the 
2 n-2 n 

directionality/reversibility in the NLAR(2) model or help in identifying 

the appropriateness of the Laplacian model. This is because all third 

moments have zero expectation. Thus, we see that E(R^R .„) = 

n n + X, 

E(R R* „) = 0 for all Jl. 
n n+ii, 

Note that the basis for the residual analysis using the 

process is that this process is uncorrelated but not necessarily 

independent. The moment results show that the have zero skewness. 

In fact, it is easy to show that the distribution of R^ is the same as 

the distribution of -R . Thus the R 's are symmetric although they 

n n 

will, of course, not have Laplacian distributions. 

In Chapter IV of this thesis, a residual analysis based on 
certain fourth-order moments is presented. 

D. THE NEW LAPLACE FIRST-ORDER AUTOREGRESSIVE MODEL, NLAR(1 ) 

1 . Introduction 

The new Laplace first-order autoregressi ve model is another 

special case of the NLARMA(p,q) model when q=0 and p=1 . This is, of 

course, a special case of the NLAR(2) model, where either and/or 3^ 

are zero in (II. C. 1.1). Examples of the different sample path behavior 

obtainable from the NLAR( 1 ) Process are given in Figure II. D. 1.1. Note 

that each sample has the same value of lag-1 serial correlation, i.e. 

p(1) = Corr(X ,X ,). In Figure II. D. 1.2 are the correspond! ng scatter 

plots for the samples in Figure II. D. 1.1. In the scatter plot labeled, 

"a,=1", the distinctive regression line, x = px , is clearly visible 
1 ° ’ n n-1 
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NLAR(1): SAMPLE PATHS; p(l) 



CD 








- s 



- 5 
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Figure II. D, 1.1. NLAR(l) : Sample Paths; p(l)=.64 



NLAR(1); SCATTER PLOTS; p(1) 
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Figure II. D. 1.2. HLAR(l) ; Scatter Plots; p(l)=.64 



for the LAR(l) process. This is produced as explained in Section II. B, 

because the innovation, e^, can be zero with non-zero probability. 

The t wo“ par am et er autoregressive model generates an {X } 

n 

sequence which satisfies 



X 

n 



K’B.X + £ , 
n 1 n-1 n' 



= K’6,X , 

n 1 n-1 



K L 
n n 



(II. D. 1.1) 



where 



K’ = 
n 



and 



1 w.p. 

0 w.p. 1-a^ 




1 w.p. 1-p 

(II. D. 1.2) 

/T^lBjLn w.p. p^ 



P 2 = a^B*/{1 “ (1-a^)6^}. 



(II. D. 1 . 3 ) 



Also, , {K^}, {L^} are i.i.d. sequences independent of each other 

and independent of X ., X ...... 

n-1 ’ n-2 

From (II. D. 1.2) and (II.D.1.3)> we see that the inversion of the 

— “I /2 — 1 

characteristic function for e^, letting X = (1-a^) > gives 

for 0<a^<1 



^^*^2^ -Ixl ^^2 -Xlxl 

f^ (x) = 2 e 1^1 ^ e '^1^1 , (II.D.1.U) 

which is a convex mixture of Laplace densities. This result also 
follows directly from Section II. C. 3. since the NLAR( 1 ) model is an 
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NLAR(2) model. Likewise, the correlation structure and partial time 
reversibility in the sense of directional moments are the corresponding 
results for the NLAR(2) model with a2=0 or That is 

Corr(X ,X , ) = for all k = 0, ±1 , ±2, ... (II. D. 1.5) 

n n“k 

and 

E(X^X , ) = E(X X* ) = 0 for all k = 0, ±1, +2. (II. D. 1.6) 
n n+k n n+k 

We can rewrite (II. D. 1.1) as 



X 

n 



n ^ j“1 

e I IT K’ ,)e^^. 

" j=1 1 i=o " " " J 



(II. D. 1 .7) 



From (II.D.1.1), it is clear that X depends only on X , and e . From 

(II. D. 1.7), we see that is independent of ^or all kSO. Hence 

{X^} is a first-order Markov process and starting X^ with a standard 

Laplace distribution makes stationary. 

The remainder of this section is devoted to specific results for 

the NLAR(1) process which have not been shown in the more general 

NLAR(2) model. The extension of these results to the NLAR(2) process 

would require the joint distribution of {X ,X , ,X which has not 

^ ^ n n- 1 n-2 ’ 

been derived. The conditional density of X given X , is derived, as 

n n-1 

well as an expression for the Joint distribution of the X^ . The 

distribution for the differences Z =X -X , is also derived. Parameter 

n n n-1 

estimation is discussed in the context of moment estimators and least 
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squares using the linearized residual. The problems with finding the 

maximum likelihood estimators of and are also addressed. 

2. Conditional Density and the Joint Density of (X^ X^) 

To find the conditional density of X^ , given we use 

(II. D. 1.1) “ (II. D. 1.4) to evaluate P(X <x |X ,). We have for a,<1, 

n n ' n'^ 1 1 

which eliminates the LAR(1) process, 



P(X <x lx , ) = P(K' e,X + e <x |X ) 
n n' n-^l n 1 n-1 n n' n-^l 



= a P(e <x -3.x ) + (1-a,)P(e <x ) 

I n n I I 1 n n 



X -3,x , 

n 1 n-1 



[ f (x)dx + (1-a ) [ f (x)dx. (II. D. 2.1) 



Differentiating (II. D. 2.1) with respect to x^ yields the following 
expression for a^<1, 

f„ |„ (x„lx„^.) = a.f^ (x -3,x„,,) + (l-a.)f^ (x„). (II.D.2.2) 

XX.n'n-1 1e n1n-1 1e n 

n ' n- 1 n n 

Examples of (II.D.2.2) for a fixed x , and fixed y = a,3, = .64 are 

n-1 11 

given in Figure II. D. 2.1. 

Now we can write the joint density f^ v (x , x . ) as the 

X X , n n- 1 
n n- 1 

product fy ly (x |x ^,)fY (x ^,). In fact, the n-dimensional 
A A „ n n.“" 1 A ^ n*" i 

n' n-1 n-1 

distribution of X^,...,X^ is obtained using this product recursively to 
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CONDITIONAL DENSITIES IN THE NLAR(l) PROCESSES 
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Figure II. D. 2.1. Examples of Conditional Density of NLAR(l) 

Process for a.< 1, |3n l< and a 3^ = .64 



obtain the density 




> • • • > 





(II.D.2.4) 



3 . Distribution of Differences and P(X ,>X ) 
n^1 n 

We now consider the distribution of the difference Z = X ‘X , . 

n n n^^l 

Using (II. D. 1.1) (II. D. 1.4) and the fact that e is a convex mixture 

n 

of Laplacian random variables, we used partial fraction decomposition to 
invert the characteristic function of to obtain the following 
expression for the density: 



f^, (y) = exp{-‘|y |/(1^B^ )} 
n 



a^(1'‘6^)| [■ P 2 





6,(2-6j 



a 



(1-aJ 



+ exp(-^ |y |/o) (ap 2 / 2 ) - 



2 



(1^6. ) ^ 



lexp(-‘ |y I ) 




(1-a^)(1-P2) 




+ 



2 






(T‘a^) |y |exp(-‘ |y | )/4, 



(II. D. 3 . 1 ) 



where 0 ^ = (1-"a,)6? 
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One immediate result is that f (y) is symmetric about zero and 

Zi 

n 

therefore, P(Z^<0) = P(Z^>0) = 1/2. This demonstrates one additional 

feature of the partial time reversibility of the NLAR(1) models; i.e., 

probabilities of a run down (X >X . ) and a run up (X <X ) are the 

n n-“l n n-^l 

same. To evaluate probabilities of higher order runs would require the 
joint distribution of the sequence {Z }. This result has not been 
obtained for the NLAR(2) model. 

4. Estimation of Serial Correlation 
a. Introduction 

The purpose of this section is to present estimators of the 

two parameters and 6^ whose product is the correlation coefficient in 

the NLAR(1) models. We assume throughout this section, unless otherwise 

stated, that has a standard Laplace (p = 0, X = 1) marginal 

distribution. Estimation of p and X for models that have marginal 

Laplace distributions are discussed in Chapter III. We also only 

consider the random coefficient models of the NLAR(1 ) process, i.e. a<1 , 

thus eliminating the LAR(1) model. As was shown in the introduction to 

this chapter, for a^=1, 3^ can be estimated very efficiently, thus 

eliminating the need for further discussion. 

The method of moments is used first to find an estimator of 

Y = The Joint moment estimators of and 6^ are calculated from 

f ourth'^order moments. These estimators are used later in an iterative 

procedure to obtain the joint least squares estimators of and 3^ . 

A least squares estimation procedure is defined for the 

NLAR(1) models using the usual linear residual R = X ^a.3.X 

n n 1 1 n-“1 
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Minimizing the sum of R* leads to the usual estimator of Y as given in 
standard texts on time series. In order to estimate and 6^ 
individually, we minimize the square of a particular function of R^ with 
respect to and . 

In the last part of this section, the problems of maximum 
likelihood estimation in the NLAR(1 ) process are discussed. Although no 
results are presented for the general model, the maximum likelihood 
estimator of the correlation coefficient in the TLAR(1) model is given, 
b. Method of Moments 

(1) Estimation of Y by Second^Order Moments . Since is 
assumed to have a standard Laplace distribution with E(X^) = 0 and 
Var(X^) = 2, an immediately obvious choice for estimating 
Y = Corr(X^,X^^^ ) is the following product moment; 



Y 



1 

2 



n 



I X.X 
i=2 ^ 



i-^l 



(n-1) 



(II.D.4. 1 ) 



Taking the expectation of Y and using (II. D. 1.1), we have 



E(Y) 



1 

2(n-1) 



n 

I 



1 






n 

I 



(II.D.4. 2) 



so that the estimator is unbiased. 

(2) Joint Estimation of a^and B^by Fourth^Order Moments. 
The expectation of f our th-^order moments can be calculated using 
(II. D. 1.1) and the fact that {X^} is a stationary process. For example 
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(II.D.4.3) 



E(X?Xi^l) = 12a^B^{1 + (2-a^)6f} , 

E(X?Xj^l) = 4(1+5a^B^^) , (II.D.4.4) 

E(XiX?^l) = 24a^B^ , (II.D.4.5) 

ECXjX^^^X.^^) = 4a^ B^ {1 +2a^ Bj+3a^ (2-^0^ ) Bj} . (II.D.4.6) 



Solving for and B^ in different pairs of these 
equations gives the estimators based on fourth-^order moments. It is to 
our advantage to use the expressions with the lower order moments where 
possible. Therefore, using E(X^X?^^) and E(X^X^^^) instead of 
E(X^X*^^ ) , we solve for the following expressions for the joint moment 
estimators of and B^ 



ot. = 



t X .X . 
1-2 ‘ ‘• 



(n-l) 



n 

I 

i=2 



(II. D. 4.7) 



x^x? , 
1 1^1 



4 ( n'^ 1 ) 



B. = 



i=2 

n 

10 y x.x. , 
i=2 



(II.D.4.8) 



From the scatter plot analyses in Figures II. D. 4.1 and 
II.D.4.2, we see an example of the behavior of this pair of estimators 
when = B^ = .8 in the NLAR(1) model. Both scatter plots contain 500 
pairs (a^,B^) derived first from samples of size 250 and then from 



62 



SCATTER PLOT TABLE 
X :AMA 

Y :BMA 

SELECTION :(|AMA) < 10 

X LABEL : MOMENT ESTIMATOR OF 
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Figure II. D, 4.1. Scatter Plot Analysis of Joint Moment Estimators of in the NLAR(l) 

Process for 500 Samples of Size 250 with a =3 =.B 
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Figure II.D.4.2. Scatter Plot Analysis of Joint Moment Estimators of (aj^,3j^) iri the NLAR(l) 

Process for 500 Samples of Size 2500 with a =3, = .8 



samples of size 2500. It is clear from the equations (II. D. 4.7) and 

A A A 

(II.D.4.8) that y = The hyperbola can be seen in both scatter 

plots. Both parts are visible for sample size of 250. However, for 
pairs derived from samples of size 2500, only the part in the first 
quadrant is visible. 

From the Normal probability plots in Figure II.D.4.3, 
there is little evidence of non-^Normality for 7 = for N = 250, and 

less for the estimator derived from samples of size 2500. However, 

A A 

individual estimators and 6^ look far less Normal for both sets of 
sample sizes. 

c. Least Squares Estimation in the NLAR(1) Process 

(1) The Linear Residual . The properties of the linear 
residual are developed for use in deriving the least squares estimators 
of Y = and for and B^ jointly. We begin by rewriting (II. D. 1.1) 

in the RCA(1) form as given in (II. C. 2.1). We have 

"n ■ “I'lVl * (n.D.M.9) 

From this expression, there are clearly two ways to write down the 
linear residual, R^. The usual one from linear theory is, of course 

R = X -a,B,X , . (II. D. 4. 10) 

n n 1 1 n“ 1 
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MOMENT ESTIMATOR OF 0| MOMENT ESPMATOR OF 8t MOMENT ESTIMATOR OF a,B, 

Figure II.D.4.3. Normal Probability Plots of Moment Estimators of and in the 

Process for 500 Samples of Sizes 250 and 2500 



However, a particularly useful way of looking at it is from 



6,(K-c.,)X 



+ e . 
•1 n 



(II.D.M.1 1 ) 



It is from (II. D. 4. 11) that we see explicitly how the i.i.d. innovation, 

{e }, and the coefficient {K''“a,} processes impact on the linear 
n n 1 

residual. 

be the o'‘algebra generated by [ { ( K ^ ) , } ; 

k = 1 , . . . ,n-‘1 ] . Intuitively, , represents all the information 

about the process up to time n^‘1 . Conditioning on , we have the 

following two useful properties of as noted by Nicholls and Quinn 
[Ref. 16: p. M2]. 



E(R 

n 




) = 0 . 



(II.D.M.12) 



E(R 



"I 

n ' 




) = B?Var(KMx^^. + Var(e^) 

1 n n-^ 1 n 

= ( l-'a^ ) + 2(1-‘a^Bp 



(II.D.M.13) 

(II.D.M.IM) 



These results follow because X , is a function only of 

n'^l 

the process through (n^^l ) and ) and are both independent of it. 

(2) The Least Squares Estimator of Y = a,B,. Using R from 
2 1 1 n 

(II.D.M.10) and a given sample from we obtain the least squares 

n 

estimator by minimizing the sum R? with respect to the product oi.B^ 

i=2 ^ 

which is now called Y. We have 
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/V 



(II. D. 4. 15) 



y 



n 



I X.X 
1=2 ^ 



1^1 



n 



l X 
1=2 



2 

1-^1 



> 



which, In fact, Is the usual expression for the estimation of serial 
correlation In linear AR(1 ) models as given, for example. In Chatfleld 
[Ref . 31 : p. 66] . 

Since the NLAR( 1 ) process Is an RCA(1) process of 

Nlcholls and Quinn, It follows from their theorem [Ref. 16: p. 44] that 

^ 1 
Y Is strongly consistent, asymptotically unbiased and (Y^Y) has an 

/IT 

asymptotic Normal distribution. The asymptotic variance, from the same 
results of Nlcholls and Quinn, Is 



ol = 1 + 6(a^g^)^. (II. D. 4. 16) 

Figures II .D . 4 . 4^11 . D. 4 .7 contain the boxplot analysis 
of SIMTBED [Ref. 15] output for selected choices of and 3^ In the 
simulation of the least squares estimator of the product In the 

NLAR(1) processes. Note that although the estimated asymptotic mean Is 
the true value, Y = = .64, for each of the four sets of the 

parameters, the estimated asymptotic variance of the estimator of 
a^B^ = Y Is different for each of the four different sets of 
parameters. The simulation results reflect the asymptotic theoretical 
results for the NLAR( 1 ) processes as given above. 
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SAMPLE SIZE (N):20000 NO. OF REPLICATIONS (M): 10 




oo >- 



I I 

UiWOV » 
«0««0<V(M AO • 

\o»rur 
fomsooiO fy<»v- 




lT\ 







• • >ocN« «o 

O . . oo — m 

iTk OCXS I <n o CO 

OJ (HNOO ON 

OONff ON 
tnoitr\ <n 

I I 

00 

1 I 



CO 

»-<M AT 

OO tntnh^ Cv 
I I 0«OUN ^0 

UlUtSTM <neOO On 
fOONdVOr^ NOtTir* 
roir«oir<co ... AT 
co^osoa^ OOO 
V0S0«- . • I 

ON . . oo 

fM OOO I 



•-fNI 

oo 

■ I oo 

bJUiNO*- I I 
COO^—CO UMd 
fWNrxM«n *»> 

fNir*-<«A<x»> eooo- CO 
'Of'>^ • . o«>.^ *- 
O • • 0 »- r««<o*A «n 
o OOO I cndv- ON 
•« N«>««<N< tn 

OOO O* 




I I 

UJUCOO 



r4«-f«oco 



• • oo </> 

OOO I >- 

z 



u 

<M — 

o w.. 

I lM/> .. </> 

UilTisO UJH-U) 
ON«XfV>^ OZ>- 2 
ouz uj 
^OrwxfN — UJ — 
vO*-F- . . lO— O 

O • • -oo — O — 

IT> OOO I t/H* — U. 



fM 

O 



»r» • • o 

CM OOO I 



Owu. ui 
<OUJ o 
KUO U 
UJ L> 

> » I 
< I 



ZV>V) 

OUiV) 

— KUi 

wiooc 

d)uiO 



K 

< 

> 



< 

COW 

CONI 



V)</> 



QC flC 
Ow 
UJOU. 
K O 



O 

z 

o 



Z</X/) UJ — 

<d>— U.u> V) 

WWCO OZW </> 
ZZO <0 UJ 



Z Z)- z— QC 

<OOwK <QCO O 

Ui>— W< >— W 

Z(0</)(/):ic XX/) a 



0) 

J-l 

:3 

Cn 

•H 

t. 



69 



II.D.4.4. SIMTBED Boxplot Analysis of Least Squares Estimator of T“Ct-|^3-|^ 
LAR(l) Process with y=.64 
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SAMPLE SIZE (N);20000 NO. OF REPLICATIOMS (M): 10 
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Figure II.D.4.6. SIMTBF.D Boxplot Analysis of Least Squares Estimator of with 

a =B =.0 in the NLAR(l) Process 
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Figure II.D.4.7. SIMTBED Boxplot Analysis of Least Squares Estimator of y=a^3^ with 

Y=.64 in the TLAR(l) Process 



An analogous result is given in Section III.E.M, where 

the theory of least squares is derived for the Beta-^Laplace AR(1) model. 

(3) The Joint Least Squares Estimation for and 6^. It 

n 

is not possible to minimize 'l R? with respect to a and 6 individ^ 

i-2 ^ ^ ' 

ually. However, a technique from Nicholls and Quinn [Ref. l6: p. 43], 

which uses the result in (II.D.4.13) is applicable. As was pointed out 

earlier in Section II. C. 2, by assuming nothing about the particular 

marginal distribution, Nicholls and Quinn were free to treat the 

variances, and a^, , as completely independent parameters subject only 

to the constraint that the marginal distribution of whatever it 

is, has a positive variance. Then, given (II. D. 4. 13), it was possible 

n 

to estimate and by minimizing the sum of squares ^ S? where 
e K .^2 ^ 



S = 
n 




n-^l 



(II.D.4.17) 



and R^ = (X ■‘TX ,)^ and T is from (II. D. 4. 15). They derive the 
n n n*“ 1 

properties of the trivariate distribution of the estimator of 

(X, a^, a^,)* 

Since and a^, are related parametrically in a, and 
6^, the results in [Ref. 16] concerning the variances do not apply in 
the NLAR(l) process. However, we can form from (II. D. 4. 13 ) and 
(II.D.4.10) an analogous expression for 
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2( 6^) , 



(II.D.H.18) 



S 

n 



?2 



a,8?(1^ajX 






1 ' n-1 



where the product a, 6, in R is not replaced by Y from ( II.D . 1 5) . 

1 1 n 

In terms of a sample from we define the joint 

A A 

least squares estimators of and to be those values and that 
minimize 



n 

I 

1=2 



{(X.' 






( 1- 



•a 






2(1- 



ai^i 






(II.D.H.19) 



where (II.D.M.19) is the sura of the squares of given in ( II . D . . 1 8 ) . 
Now it is clear that (II. D. 4. 19) is a highly nonlinear expression in two 
unknowns, and 6^. A given numerical technique could converge to a 
local extremum, a saddle point, or diverge depending on, among other 
things, the starting values for estimating ot^ and 6^. 

Constraining the nonlinear optimization problem given 
by (II.D.4.19) to the rectangle within which the NLAR(1) process is 
defined'^'^O ^ S 1 and "=1 ^ ^ 1 '‘'‘eliminates the divergence problem, 
but clouds the estimation issue regarding the boundary models LAR(1) and 
TLAR(1). We try an unconstrained approach described below. 

(4) An Unconstrained Nonlinear Optimization of (II.D.4.19) . 
It is easy, but tedious, to write the normal equations from (II.D.4.19). 
One critical point is at = 6^ = 0. After factoring from the one 
equation and 8^ from the second, several iterations of the Newton'‘ 
Raphson method (see, for example, Gerald [Ref. 28: pp. 122'‘128]) can be 
performed to find other critical points. The Newton'‘Raphson method uses 
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a second^order Taylor series approximation to solve the non-linear 
system by a set of linear Jacobian equations. However, one needs to 
calculate the four second partial derivatives from (II. D. 4. 19) and to 
have a good starting point on the surface. 

The IMSL routine ZSPOW solves systems of non-linear 
equations for one root using modified Newton methods. This routine was 
used to solve the unconstrained problem of finding and from sets 
of data from simulated NLAR(1) processes. The routine was very senstive 
to starting values and did not always converge even when the sample size 
was as large as 2500. It also did not perform well when the true 
correlation coefficient, Y = o^B^. was small for any of the simulated 

I k I 

NLAR(l) processes with the same autocorrelation function, Y' '. This 
problem is highlighted by the fact that (II.D.4.19) is constant along 
the line = 0 and the line B^ = 0. 

As an illustration of the performance of the routine, 
500 sets of sample sizes 250 and 2500, respectively, were generated from 
the NLAR(1) process with = B^ = .8. The scatter plot analyses in 
Figures II. D. 4.8 and II. D. 4.9 show how the estimators and B^ 
determined by ZSPOW are related. Especially for the samples of size 
250, there is the same pattern of the hyperbola as seen in the moment 
estimators of and B^ given in Section 1 1 . D . 4 . b . ( 2 ) . From the 
accompanying tables, it is clear that the variance of the marginal 
distributions for each estimator and B^ is decreasing with increased 
sample size. The Normal plots of the empirical marginal cumulative 
distribution functions for and for B^ appear very non-Normal even 
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Figure II.D. 4 . 8 . Scatter Plot Analysis of Joint Least Squares Estimators of (a^, 3 ^) in the 

Process for 500 Samples of Size 250 with a = 3 , = .8 
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from estimators derived from samples of 2500. On the other hand, the 
Normal plots of Y = indicate that the distribution is converging to 

a Normal distribution as required by theoretical results of the 
previous subsection. (See Figure II. D. 4. 10). 

It is convenient, at this point, to summarize the 
results on the moment and least squares estimation of Y = and 

(a^,8^) in the NLAR(l) processes. 

In the estimation of Y, only second'^or der product 
moments are required for both methods. From the Normal probability 
plots in Figures II.D.4.3 and II. D. 4. 10, it appears that both estimators 
of Y are converging to Normal distributions. Although the moment 
estimator of Y is unbiased (the least squares estimator is 
asymptotically unbiased), the variance of the moment estimator of Y is 
considerably larger than that of the least squares estimator of Y. 

The estimation of and 6^ requires f ourth-^order 
product moments for both methods. The variance of the moment estimators 
of and are too large, even for samples of size 2500 to be useful 
in distinguishing between NLAR(1) processes. The least squares 
estimators of and B^ have smaller variances than the corresponding 
moment estimators and could be useful in distinguishing between NLAR(1) 
processes. However, as pointed out above, the numerical routine to find 
the critical points does not always converge for a given starting value 
of and B^ • The conclusion is that neither method of estimating 
and B^ is very satisfactory. 
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The median 



(5) The Median (X./Xj^_^^) Estimator of Y = a^B^. 
of (X^/X^^^) was seen to be extremely efficient in the LAR(1) process. 
It also makes sense in the context of maximum likelihood estimation in 
LAR(1). This is discussed in the next section. 

Simulation results confirm the conjecture that the 
median (X^^/X^^^ ) is not a robust estimator of Y for departures from the 
LAR( 1 ) process. In fact, from the boxplots in Figures II. D. 4. 11 - 
II. 0.4.14 of SIMTBED output for four NLAR(1) processes, the estimators 
seem to become more biased as 6^ approaches one — corresponding to the 
other boundary process, TLAR( 1 ) . Even for the small size of the 
simulations, the standard deviation of the mean is small. For the three 
non-LAR(l) models, the asymptotic estimates of the mean of Y given in 
the data are each significantly different from the theoretical value of 
Y = .64. 

d. Method of Maximum Likelihood 

(1) Introduct io n. The logarithm of the likelihood 
function, L(a^,6^), is obtained by taking the natural logarithm of the 
n-dimensional joint density given in (II.D.2.4) and treating it as a 
function of and for a given realization of length n from We 
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Figure II. D. 4. 11. SIMTBED Boxplot Analysis of Median (X^/X^_^) Estimator of Y=“-|^02^ with 

Y=.64 in the LAR(l) Process 
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Figure II. D. 4. 12. SIMTBED Boxplot Analysis of Median (X^/X. Estimator of with 

a, =.9 and 3, =.71 in the MLAR(l) Process 
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II. D. 4 . 13 . SIMTBED Boxplot Analysis of Median (X^/X^ 
a =B =.8 in the NLAR(l) Process 



SAMPLE SIZE (N):20000 NO. OF REPLICATIONS (M): 10 DEGREE OF REGRESSION (0): 
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II. D. 4, 14. SIMTBED Boxplot Analysis of Median (X^/X^_j^) Estimator of with 

Y=.64 in the TLAR(l) Process 



have 



n 

L(a^,e^) = -n(ln 2 ) - |xj + J, i.n[a^ ( 1-P2)exp{-|x^-6^x^_ J } 

+ (1-a^ ) (1-P2)exp{-|x J p^Xexp { -X | x^-6 J } 

+ (1-a^)P2Xexp{-X|xJ }], (II.D.4.20) 

1 

where p^ was given in (II.D.13) and X = • 

Maximizing (II.D.4.20) in the general NLAR(1) model is 
not accomplished here for two reasons. First, L(a^,6^) is not 
differentiable with respect to' at any of the n values B^ = x^/x^_^ 
for i = 1,...,n, because of the terms |x.-B^x^_^|. A bivariate search 
routine that does not use derivatives is needed. 

Second, L(a^,B^) is not defined along the line = 1 
at any of 0 ^ k ^ n values of B^ such that -1 < B^ = 1 • To 

see this, examine the third term of the natural logarithm in 
(II.D.4.20). We have replacing X for all i = 2,...,n 



/(l-a^)B^ 

Because of the presence of the exponential term in (II.D.4.21), the 
limit as approaches one is zero, so long as B^ * x^/x^_^ . The limit 
does not exist on the set B = (B^|B^ = xVx^^_^; i = 2 n}. 



.-|x.-BiX._i 



/(l-a,)B? 



(II. D. 4.21 ) 
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It is worth noting that for = 1 , corresponding to 
the LAR( 1 ) model, and except on the set B, ( II.D. il. 20) , can be written 
as 



L(1,e^) = -nO!,n2) - |xj + (n-1 ) i!,n( 1 -gp 
n 

- I |x -6 X I , B d B. (II.D.i).22) 

i=2 

Now Jln(l-e^) is maximized at = 0 and the optimal value for 
n 

I |x^-g^x^_^| is the least absolute deviation (LAD) estimator of 6^ 

which is the weighted median of (x^/x^_^ ) where the weights are x^_^ for 

i = 2,...,n. Thus, if after a large number of observations from {X^} no 

repeats of x./x. , are observed, then there will be little difference 

between a particular LAR( 1 ) model and the completely random model of 

i.i.d. Laplace variables. In this case, for any 3^ in a small deleted 

neighborhood around 3^ = med(x^/x^_^ ) , (II.D.H.22) will be large because 

n 

both in(1-3p and will be optimized. 

i=1 

(2) The Maximum Likelihood Estimator of in the TLAR(l) 
Processes . In this section, the likelihood function for the TLAR( 1 ) 
process is described. The maximum likelihood estimator is found using a 
numerical iteration scheme. The properties of the estimator are 
investigated and compared to the least squares estimator using 
simulation . 
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For the TLAR( 1 ) models (6^ = 1 or 6^ - -1), (II. D. 4. 20) 
can be written as a one-dimensional function of the a variable a. V/e 
have 



L(a) = -n(Jln2) 



n 

+ I Uni 

i=2 



/1 



exp 



/r-aV 



+ 



/l-a, exp} 



(II, D. 4. 23) 



where 



X. 

1 


X 

1 


a 


> 


0 , 


X. 

1 


" Vi 


a 


< 


0 , 



(II.D.4.24) 



-1 < a < 1 and = | oc| • 



Now L(a) is continuous everywhere in the open interval 



(-1,+1) and differentiable everywhere except at a = 0. The expressions 
- dL(a) ^ d^L(a) , ^ u k 

for — T and — : — 5 — af'e lengthy and cumbersome to use; hence are not 

da da 

given here. 



Examples of the likelihood curve are given in Figures 
II.D. 4.15 - II.D. 4 . 18 . Each curve was generated from a sample of 100 
from a simulated TLAR(l) process with the stated a^ and 6j. It is easy 
to see the non-dif f erentiable point at zero and how flat the curve is. 
To see that there is a maximum (annotated with x on the figure) in these 
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TLAR(1); LOG-LIKELIHOOD FUNCTION; a, = .5 AND = -1 SSN = 100 





88 



Figure II. D. 4. 15. TLAR(l) ; Log-Likelihood Function; a^=.5, ^^=-1 and SSN-100 



TLAR(1): LOG-LIKELIHOOD FUNCTION; a, = .1 AND B, = 1 SSN = 100 



rO 




(»)n 
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Figure II. D. 4. 16. TLAR(l) : Log -Likelihood Function; a^=.l, and SSN-100 



TLAR(1): LOG-LIKELIHOOD FUNCTION; a, = .64 AND B, = 1 SSN = 100 



00 




o 
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Figure II. D. 4. 17. TLAR(l) : Log-Likelihood Function; a =.64, 3,=1 and SSN=100 



TLAR(1): LOG-LIKELIHOOD FUNCTION; a, = .9 AND B, = 1 SSN = 100 
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Figure II. D. 4. 18. TLAR(l) : Log-Likelihood Function; Oj^=.9, and SSN=100 



curves, the second part of each figure focuses on the function near the 
true value of . 

The IMSL routine, ZXLSF, a one-di mens i onal search 
routine was used to find the value of a that maximized ( II. D . ^1. 23) . The 
starting value a was the least squares estimator of serial correlation 
given by (II, D. 4. 15). 

Using 500 samples of sizes 50 and 500, respectively, 
from simulated TLAR( 1 ) processes with Y = .6^1, the scatter plot analyses 
in Figures II. D. 4. 19 and II.D.4.20 were completed. The least squares 
estimator and maximum likelihood estimator appear to be correlated. 
From the accompanying tables, the maximum likelihood estimator appears 
to have a smaller variance and bias than the least squares estimator. 
Analysis of the boxplots from a SIMTBED comparison of the least squares 
estimator and the maximum likelihoood estimator reflect the same results 
(see Figures (II.D.4.21 - II. D. 4. 22). 

From the Normal plots given in Figure II. D. 4.23, both 
the least squares and the maximum likelihood estimator appear to be 
coverging to a Normal distribution. There are three or four outliers in 
the tail out of 500 points. 

E. OTHER CASES OF THE NLARMA(p,q) MODEL 
1 . Introduction 

A primary advantage of the NLARMA(p,q) model is the ease with 
which the basic framework can be altered to cover a variety of different 
dependency structures. The NLAR(2) and NLAR(1) processes have been 
examined closely in the previous sections of this chapter. At this 
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SCATTER PLOT TABLE 
X : AMLE 

Y :ALS 

SELECTION :ALL 

X LABEL :MAXIMUM LIKELIHOOD ESTIMATOR 



q: 
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Fiqure II. D. 4. 19. Scatter Plot Analysis of the Maximum Likelihood and the Least Squares 

Estimators of y in the TLAR(l) Process for 500 Samples of Size 50 with 
a =.64 and 6, =+l 
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Figure II. D. 4. 20. Scatter Plot Analysis of the Maximum Likelihood and the Least Squares 

Estimators of y the TLAR(l) Process for 500 Samples of Size 500 with 
a^ = .64 and 3,=+! 
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Figure II. D. 4. 21. SIMTBED Boxplot Analysis of the Maximum Likelihood Estimator of y 

with y=.64 in the TLAR(l) Process 
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Figure II. D. 4. 22. SIMTBFD Boxplot Analysis of the Least Squares Estimator of y v^ith 

Y=.64 in the TLAR(l) Process 
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Figure II.D,4,23. Normal Probability Plots of the Maximum Likelihood and the 

Least Squares Estimators of y in the TLAR(l) Process for 
Samples of Sizes 50 and 500 v;ith y=,64 



time, the moving average first-order model, NLMA(1), and the mixed 

model, NLARMA(1,1), are briefly considered. The correlation structure 

and parameter space are discussed for each model. 

The TLAR(I) model for which the maximum likelihood estimation 

was completed, can be easily extended. As the final part of this 

t h 

section, we present the p -order autoregressive processes for arbitrary 

p ^ 2. The conditions for existence and uniqueness, the correlation 

structure and likelihood function are given. The maximum likelihood 

estimation scheme for the p parameters is also discussed. 

2. A Backwards MA(1) Model, NLMA(1) 

a. Correlation Structure of the NLMA(l) Process 

From (II. D. 1.1), we see that is the random coefficient 

sum of independent variables each of which have a marginal Laplace 

distribution. Therefore, we can replace by another Laplace 

variable. If it is independent of L and has a standard Laplace 

marginal distribution, then by the construction, will still have a 

standard Laplace marginal distribution. 

If we replace X . , in fact, by L .in (II. D. 1.1), we 
n-1 •' n-1 ' 

obtain the following expression for X^ 

X = K’e^L^ . + K L^ , (II. E. 2.1) 

n n 1 n-1 n n 

where and are as given in (II. D. 1.2) and {K^} is the 

corresponding two-valued discrete variable as given in (II.C.2.11) for 
the NLAR(2) model. 
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Since X _ is by construction in (II. E. 2.1) independent of 

for |k| ^ 2 , we see that the model has the cut off property of a 

linear MA(1) model. The maximum range of correlations in any MA(1) is 

less than or equal to |l/2|, (Fuller [Ref. 29: p. 62]). This range is 

achieved by the linear MA(1) models. Some of the random coefficient 

MA ( 1 ) models have been shown to have a maximum range for the 

Corr(X ,X , ) to be strictly less than one-half (see Hugus [Ref. 30]). 
n n~l 

Using (II. E. 2.1) recursively, we derive the serial 
correlation in NLMA(1) as 



Corr(X ,X , ) = 
n n-1 



Cov(X^,X^ .) 
n n- 1 

Var(X ) 
n 



E{(K’6,L +K L )X , } 
n 1 n-1 n n n-1 









. E{L ,(K' ,6,X _.K ,1, ,)), 

2 n-1 n-1 1 n-2 n-1 n-1 



= a^6^E(K^_^). (II.E.2.2) 

Substituting in the values of the i.i.d. sequence ^ } with the 

corresponding probabilities P 2 « "^~^2 (II. D. 1.3) we have 
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Corr(X^,X^_^ ) = a^6^{(1-P2) + )6^ P 2 } 



“ 1^1 



1-6^ + B^/( 1-a^ ) 6^ 



1 - (1-a^)6| 



(II.E.2.3) 



Figure II. E. 2.1 is a contour plot of the level curves for 

0 ( 1 ) = Corr(X ,X .). Notice that in this model, the correlation is 
n n" 1 

restricted in range over that of the linear MA(1) models. Using the 
IMSL global constrained optimization routine, ZXMWD , with multiple 
starts, the extremes for lag-1 serial correlation are |p(1)| ^ O. 11026 , 
occurring at = .903 and 6 ^ = ±.690. In Chapter III, we give a 
continuous random coefficient model with MA(1) correlation structure, 
Laplace marginal distribution, and the full range of correlations, i.e. 
|p(1)| ^ 5. 

b. Invertibility in NLMA(1) 

It is well known (Chatfield [Ref. 31 » P* ^3]) that if 



X = Z + 6,Z , , (II.E.2.4) 
n n 1 n-1 

is a linear MA(1) model, then substituting (1/B^) in for does not 
change the autocorrelation function. This implies that the linear MA(1) 
model is not uniquely determined by its autocorrelation function. 

It is also well known (Chatfield [Ref. 3 I : P- ^31) that by 
successive substitution, the MA(1) model in (II.E.2.4) can be written as 
the infinite autoregression 
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Figure II. E. 2.1. tlUlAd) : Contour Plot of the Feasible Region for p(l) in Parameter Coordinates 



Z = X - 6,x + B?X _ - +... 

n n 1 n-1 1 n-2 



(II.E.2.5) 



Likewise, if 1/8^ is in (II.E.2.4), we have 



Z =X - i X + iiX ^-+. 
n n 8^ n-1 8^ n-2 



(II.E.2.6) 



Unfortunately, only one of the two processes given by (II.E.2.5) and 
(II.E.2.6) yields a convergent power series depending on whether 
1 8^ 1 < 1 or not. Hence, the restriction on 8^ called "invertibility" by 
Box and Jenkins [Ref. 23: p. 50], guarantees a one-to-one 

correspondence between a linear MA(1) model and its autocorrelation 
function by restricting 6^ to be such that the MA(1) "inverted" infinite 
autoregression is the one with a convergent power series representation. 

This definition of invertibility is not totally applicable 
to random coefficient models (such as NLMA(1)) with MA(1) correlation 
structure because it has not been established that there exists a 
corresponding infinite autoregr ession model. 

Likewise, there can be an infinite number of models that 
have the same autocorrelation function and marginal distribution. This 
is the case in NLMA(l). As was seen in Figure II. E. 2.1, each contour 
line corresponds to a constant value of p(1) and is achievable by an 
infinite number of combinations of (a^,8^). 

The purpose of this section then, is to find a different, 
but meaningful, way to restrict the (a^,8^) rectangle in Figure II. E. 2.1 



102 



which: (1) does not further restrict the range of p(1): and (2) which 

within the region the NLMA( 1 ) model must be uniquely determined by p(1) 
and either or • 

From the contours in Figure II. E. 2.1, it appears that the 
feasible region for p(1) can be partitioned in such a way that the two 
goals stated above can be achieved. It is not known, however, if this 
partition can be described analytically. Figure II.E.2.2 is an 
illustration of the partition into a center region and two complementary 
disjoint regions. The center region is roughly defined as the region to 
the right of a line from (-1,.667) to (-.577,1 ) and to the left of a 
line from (.577,1 ) to (1,.667). Both lines cut across the contours in 
the depression on the left and on the ridge on the right. The center 
region is more advantageous for two reasons. First, p(1) is a 
continuous function of and 6^ in the center region. Secondly, the 
parameter estimation is more likely to be easier if the most extreme 
values of and 6^ can be avoided simultaneously. Theref ore , we shall 
call the center region of Figure II.E.2.2 the "principal" region. 

3. A Mixed Autoregressive-Moving Average Model, NLARMA(1,1) 

From the theorem in Section II. C. 2, we see that any two 
(possibly dependent) Laplace variables can be combined with an 
independent set of (again, possibly dependent) Laplace variables to form 
another Laplace variable. Using this property, if we replace 
NLAR(2) by L ,, then the marginal distribution of (X } is still 
standard Laplace. We have then 
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Figure II.E.2.2. NLMA(l) ; Boundary of Principal Region in Parameter Coordinates 



(II. E. 3.1) 



X 

n 



B.K’X , 
1 n n-1 



6„K"L + K L , 

2 n n-1 n n 



where {K',K"}, (L }, {K } are as previously defined, 
n n n n 

Notice that if is identically zero, corresponding to = 0, 

we obtain an expression of the form given by (II. E. 2.1) for NLMA(1). 

Likewise, if K” is identically zero, we have the NLAR(1) model as given 
n 

in (II.D.1 .1) . 

The NLARMA(1,1) model has the same correlation structure as the 
linear mixed model ARMA(1,1). Using (II. E. 3.1), 



E(X^X^ ,) = a,6.E(X^ .) + ,X ) 

n n-i I 1 n-1 2 2 n-1 n-1 

+ E(X ,K L ) . (II.E.3.2) 

n-1 n n 

But X , , K and L are independent so 
n-1 n n 

* V2«‘-n-,‘-n-2> * 

Conditioning on K^_-| » using the independence of and 

(X^_ 2 , L^_-|) and dividing by the Var(X^) we have 

p(l) = 0^6^ + a262(1-P2"P3‘"h2l^2'"l^3l^3^ ’ (II. E. 3-^) 
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where p^. P 1^21. defined in (II.C.2.5) through (II.C.2.9). 

For I > 2, (II.E.3.3) and (II.E.3.4) become 

E(X X -) = a.BiECX^ .X^ ,) (II. E. 3.5) 

n n-x, 1 1 n-l n-x, 

and 

pOl) = a^B^p(jl-l). (II. E. 3.6) 

These equations are the same as those of the ARMA(1,1) model 
(see Chatfield [Ref. 36 : p. 58]). However, the range of correlations 

is significantly reduced over that of ARMA(1,1). Figure II.E. 3 .I 
represents a side-by-side comparison of the (p( 1 ), p( 2 )) space for 
NLARMA(1,1) and the familiar linear ARMA(1,1). Although p(1) can range 
from -1 to + 1 , the combinations with p( 2 ) are severely limited in 
NLARMA(1,1). The minimum p(2) in NLARMA(1,1), found numerically using 
the reduced gradient method is approximately -.025 at p(1) = ±.2. As 
I p( 1 ) I increases, p( 2 ) approaches p( 1 )^. 

4. Higher Order Autoregressive Models, TLAR(p) 
a. Introduction 

It has been stated by Raftery [Ref. 32] that there exists 
NEAR(p) models for p ^ 2. Also, Nicholls and Quinn [Ref. 16] have given 
conditions for the existence and uniqueness, strict stationary, etc., 
for general RCA(p) models. However, only for the NEAR(2) and the 
NLAR(2) processes has it been shown explicitly what the necessary 
innovation is; and that it is a valid random variable. 
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POINT PLOTS OF ADMISSIBLE REGION FOR p(l) AND p(2) 
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Figure II. E. 3.1. Point Plots of Admissible Region for p(l) and p(2) for Linear ARMA(1,1) and 

NLARf'IA ( 1 , 1 ) Processes 



For p S 3, this has not been accomplished for the general 
NEAR(p) process; nor is it done now for the NLAR(p) process. However, 
there are 2^ different p^*^ order autogressive models with p parameters 
that are special cases of the NLAR(p) process. These models are called 
the TLAR(p) models. The innovation for the second-order model was given 
without proof following the theorem in Section II. C. 2. The likelihood 
function and maximum likelihood estimation of was given in Section 
II.D.iJ for the TLAR(1) processes. 

The TLAR(2) models, including the two TLAR(1) models only 
account for four of the infinite number of NLAR(2) models which all have 
the same AR(2) correlation structure and standard Laplace marginal 
distribution. Since there is a variety of different sample path 
behaviors obtainable in the general NLAR(2) model, it is possible that a 
TLAR(2) model will not always be the most appropriate model for a given 
set of data. 

However, as is shown in the remainder of this section, the 
TLAR(p) models have an advantage over the general NLAR(p) models. The 
TLAR(p) processes for p ^ 3 exist; are easily constructed; are partially 
time reversible; and are parsimonious with respect to parameters. The 
parameters in the TLAR(p) process are easily estimated from the 
conditional likelihood function by the method of maximum likelihood, 
b. Existence and Uniqueness 

The TLAR(p) models p ^ 1 have the form 



P 

= I 

i = 1 



n n-i 



"n’ 



(II.E.il.1) 
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where is assumed stationary with Laplace marginal distribution; 

\ . . . ^ is a p-variate discrete random variable independent 



K = 
n 



1 ’ V 2 


For 


all n 


(1, 0, 0,..., 


0) 


w.p. 


(0, 1 , 0, ..., 


0) 


w.p. 


• 




• 


• 




• 


• 




• 


0 

0 

0 


1) 


w.p. ap 



(0, 0, 0,..., 0) w.p. 1 - E a. = X > 0, (II.E.4.2) 

i = 1 ^ 



so for all i = 1,...,p. The 2^ choices of model arise from 

the selection of signs for each of the (either +1 or -1). 

Now if {X^} is stationary, then the following expression for 

the characteristic function of the i.i.d. innovation, e , follows from 

n 

(II.E.4.1) regardless of the choice of signs on (The distribution 

of a symmetric random variable Z is the same as that for -Z) . We have. 



P n -) 

= E[exp{-io)( I Vi^^n^^^’ 
■^n i=1 



P 

= i)> (<*>) C I a. (py (w) + X], 
n 1=1 n-i 
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= (J) (u) [(1-A) + A], 

n 

from conditioning on K , the stationarity assumption of {X } and the 

n n 

independence of e of X ,, X Therefore, substituting from 

n n-1 n-d 

(II.B.1 .2) 









(1-A) 

1 + 0 )" 



A 



= 1 /(1+Au)") . 



(II.E.4.3) 



For A > 0, (II.E.4.3) is recognized as the characteristic function of a 

scaled Laplace random variable with scale parameter /a~. 

Since (II.E.il.1) can be written as 



X 

n 



I {a.X . 
. 1 n-i 

1 = 1 



(K^^^-a.)X 
n 1 n- 



. } + e 
1 n 



(II.E.4.4) 



and satisfies the conditions in Section II. C. 2, the TLAR(p) models are 

RCA(p) models. Since the innovation {e } and {K } are i.i.d., then 

n n 

TLAR(p) are strictly stationary and {X^} is the unique solution by the 
theorems of Nicholls and Quinn [Ref. 16: p. 31 and p. 37]. 

c. Correlation Structure 

t h 

The TLAR(p) models are p -order autoregr essi ve in the sense 

that E(X lx , = X,, X - = x„,..., X = x ) is a linear function in 
n' n-1 1 n-2 2 n-p p 

, i = 1,...,p. It is also autoregr essi ve in the sense that it 



no 



satisfies a set of Yule-Walker equations. Multiplying by 



X, ^ 1 , and taking expectations, we have 






+a E(X X „). 
p n-p n-JL 



(II.E.4.5) 



Dividing by Var(X^) and substituting % = 1,...,p into (II.E.4.5), we 
have the set of equations 



p(1) = +. . . +app(p-1 ) 

p(2) = a^p(l) + a^ +. . . +app(p-2) 

p(p) = a^p(p-l) + a2p(p-2) +...+ a^ , (II.E.iJ.6) 

where a. = a. (Sign of X .) for all i = l,...,p. 

For the TLAR(2) cases, the (p(1), p(2)) admissible region is 

the entire diamond given in Figure II.C.3.1. It is divided, however, 

into four right triangles, one per quadrant, corresponding to the sign 

of X , and X „ in the model. 
n-1 n-2 

d. Conditional Density of X |X , ,X „,..., X 

■' n ' n-1 ' n-2 n-p 

The conditional density for each of the 2*^ specific choices 
of signs are easily found noting that the conditional probability is 
just a sum of Laplace cumulative distributions. We have 



in 



< ^l\-i - ^1' ^n-2 " ^2’***’ \-p “ 

= P(K^^^x, + +...+K^P^x + /T“L < x) 

n 1 n 2 n p n 



= a,P(/r”L < x-x, ) +...+ a P(/T”l < x-x ) + AP(/T"l < x) 

1 n 1 p n p n 

fx-Xi 1 fx-x I 

- “if| t *•••* (“)• (II. E. 1.7) 

' ‘■(/r ) p '-(/r ) ■'(/r) 

where F, (•) is the cumulative (ii stri but i on function of a stan(Jard 

Li 

Laplace random variable. Taking derivatives with respect to x, we have 




(II. E. 4.8) 

e. Conditional Maximum Likelihood Estimation of (a.,..., a ) 

1 P 

Since there are many p-variate Laplace distributions that 
(Xp,...,X.|) could have, and that the particular one is not known to us, 
it is not possible to form the exact likelihood function which is 
written 



f 



X 



n' 



.X 



1 





X 



i-1 ’ 





(II. E. 4.9) 



Instead, we can calculate the conditional log-likelihood 
function as the logarithm of the product of the first (n-p) terms in 
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(II.E.4.9). This is commonly done. See, for example, Priestly 
[Ref. 33: p. 350]. Using = assign ) , we have the following 

single expression for the conditional log-likelihood function, given the 
n realizations from TLAR(p) process, written as a function of a^ for 
i =* l,...,p, 



1 1 

L(a-,...,a ) = ^ in fy ^ 

^ i=p+1 1 ^i'^i-1 



n 

= I 

i=p+l 





(lI.E.iJ. 10) 



where 



V. . 

ij 



X. - X. . 

1 1-J 


if a. 


o 

All 


j = 1 » • • • J p 


J 




X. + X . . 


if a. 


< 0, 




1 1-J 


J 







(II. E. 4.1 1) 



i = p+1,...,n; a. = |a.| and \ are functions of the variable a.. 

J J J 

We see that when p = 1 (II. E. 4. 10) and (II. E. 4. 11) give the 
expressions used in the TLAR(1) process in Section II. D. 4. 

As a function of (a^,...,ap), (II. E. 4. 10) is continuous 
throughout the interior of the p-di mensi onal subspace on which it is 



defined. It is not differentiable with respect to a^ anywhere that 
a^ = 0. The maximization of (II.E.4.10) can be formulated as a 
constrained non-linear program for which a numerical routine would 
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probably be required to solve for (a^,...,ap), the Joint conditional 

likelihood estimator of (a,,... a ). 

1 P 
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III. CONTINUOUS RANDOM COEFFICIENT MODELS 
WITH SYMMETRIC NON-NORMAL MARGINALS 



A. INTRODUCTION 

The discrete random coefficient NLARMA(p,q) models studied in the 
previous chapter offered a variety of different dependency structures 
analogous to their linear ARMA(p,q) counterparts as described in the 
Box-Jenkins approach to time series analysis. These models, however, 
could be considered deficient in some ways. For one thing, all the 
models have, by design, the same marginal distribution, i .e . Laplace. 
To obtain a different marginal distribution would require starting over 
to develop the appropriate innovation sequence. Raftery [Ref. 32] has 
reported some results in extending the NEAR framework to other models 
with different marginals and ARMA correlation structures. 

Furthermore, the parameter estimation, which is easy to do in 
Gaussian linear AR(p) models, is not particularly easy in the 
autoregressive process of the NLARMA(p,q) family. In the moving average 
and mixed models of NLARMA(p,q), the maximum likelihood procedure is 
even more difficult. Raftery [Ref. 32] claims that the maximum 
likelihood estimator of S^in the NLAR(1) process would be super- 
efficient based on his work in parameter estimation in the NEAR(1) 
process and the extensions that he has proposed. Super-efficiency is 
not an attractive property of an estimator. 
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Again, the moving average model, NLMA(1), does not allow for the 
full range of correlations that are obtainable with the linear MA(1) 
model . 

Finally, note that there is another attractive property of the 
random coefficient models that is not fully exploitable in the di scr ete 
random coefficient models (NEAR(1) and NLAR(1)). That is, in the 
NLARMA(p,q) models the coefficients of the process can change somewhat 
over time and the process itself remains stationary. Andel [Ref. 3^] 
has noted that in many applications of time series analysis, 
particularly in the fields of hydrology, meteorology and biology, the 
coefficients of the model are attempting to describe complicated 
processes. The coefficients may have some random behavior of their own, 
apart from that usually attributed to the independent innovation 
sequence . 

If stationary constant coefficient models are not particularly good 
at modelling such systems (as suggested by Andel [Ref. 3^])t then the 
NLARMA(p,q) models would not be much better because the coefficients are 
limited to a finite (very small) number of possible values. However, 
Lawrance and Lewis [Ref. 6] have shown in the case of NEAR( 1 ) that it is 
possible to alter the character of the sample paths of a given low-order 
autoregression by extending the two-parameter model to one having 
parameters. The number of extra parameters could be excessive and the 
costs in parameter estimation unacceptable. 

In this chapter, a different family of stationary random coefficient 
time series models is introduced which retains many of the favorable 
aspects of the NLARMA family (specified marginal and correlation 
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structure) and offers alternatives in the areas pointed out above as 
disadvantages in the NLARMA construction. 

The symmetric marginal distribution can be specified by one shape 
parameter to be any one of an infinite number of non-Gaussian distri- 
butions. This family is the il-Laplace family and is examined in the 
next section. The f amily--including as a special case the double 
exponential (Laplace) distribution--has members with extremely high 
kurtosis, as well as those that have a limiting kurtosis that approaches 
that of the Gaussian distribution. This offers a significant advantage 
over the NLARMA models. 

Just as discrete random variables are needed for the coefficients in 
the NLARMA(p,q) models, the square roots of Beta random variables are 
used in this family of models to maintain the X, -Laplace marginal distri- 
bution. The square root Beta transf ormation theorem is the key result 
through which all the time series models in this chapter are formulated. 
By the theorem, Laplace variables are changed into those that have 
2,-Laplace distributions. Previous uses of Beta random variables in 
modelling non-Normal time series is evident in the models with Gamma 
marginals of Lewis [Ref. 35] and Hugus [Ref. 30]. 

The fact that the coefficients are continuous instead of discrete 
allows for a continuous variation. That they are functions of Beta 
random variables restricts the variation to a bounded interval. This is 
likely to facilitate the modelling of those "complicated" systems as 
described by Andel [Ref. 3^]* 
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The principal models investigated in this chapter are those with 

first-order autoregressive correlation structure. They are first-order 

Markov processes. For the purpose of discussing parameter estimation in 

this family of autoregressive models, as opposed to the NLAR(1) family, 

the focus is narrowed to that AR( 1 ) model of the family with Laplace 

marginals--the so-called Beta-Laplace First-Order Autoregressive model, 

BELAR(I). Several point estimators of location and scale are discussed 

and examined through simulation in SIMTBED [Ref. 15]. The one parameter 

which uniquely determines all the correlations of lag k in the BELAR(I) 

model can be estimated by a least squares procedure which has very nice 

asymptotic properties. The maximum likelihood estimator of serial 

correlation is also obtained using numerical methods. 

First-order autoregressive correlation structure is not the only 

type of dependency relationship that is obtainable from using the square 

root Beta-Laplace transformation. In the last section of this chapter a 

t h 

first-order moving average model and an extension to a q -order model 
are introduced. The MA(1) model retains the full range of correlations 
of the linear MA(1) models. This was not the case in the NLMA(1) model. 
B. H-LAPLACE DISTRIBUTION 

1 . The ]l-Laplace Random Variable 

It was shown in Section II. B that the standard Laplace 
distribution belongs to the class of infinitely divisible distributions. 
The probability density function of a Laplace distributed variable was 
given in (II.B.l). The characteristic function of the standard Laplace 
random variable was given in (II. B. 2). Thus if 
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)l > 0 , 



(III. B. 1.1) 



then X is a random variable. In fact it is the difference of two 
independent, identically distributed Gamma(Jl,1) random variables where 2, 
is the shape parameter and 1 is the scale parameter. Therefore, if X 
has a characteristic function given by (III.B.1.1), then X is an 
2, -Laplace random variable. 

Since (III.B.1.1) is a real function of oi, X is a symmetric 
random variable. It is easily verified that 



E(X") 



0 if n is odd, 

(k+1 if n = 2k, k 



(III.B.1 .2) 

1 , 2 ,..., 



[ k ] 

where b = b (b+1 ) . . . (b+k-1 ) for all b > 0. Since all odd moments are 
zero in (III.B.1. 2), the 2-Laplace distribution is not skewed for any 
2 > 0. From (III.B.1. 2) we find that the kurtosis is 



"2 = 



E(xS - (E(X^))^^ 

n n 

VarMX ) 
n 



( 22 )^ ^ 2 



(III.B. 1.3) 



The kurtosis approaches 3 as 2 ®, which corresponds to that of a 

Normal distribution. 

Since an 2-Laplace random variable, X(2), is the difference of 
two i.i.d. Gamma(2,l) random variables, we obtain the density for X(2) 
by using conditional expectations. 
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If G^(2,,1) and are the i.i.d. Gamma ( X. , 1 ) random 

variables, then conditioning on G 2 (J.,D, we have 



P(X<x) = P(G^-G 2 <x) = Eq {P(G^-G 2 <x|G 2 =g)} 

= Eg {P(G^<x+g)} 



(III.B.1.M) 



Since Gamma random variables are non-negative, 



P(G^<x+g) 



0 if g ^ -X, 

Fq (x+g) if g > X, 



(III.B. 1.5) 



where (x+g) is the cumulative distribution function of G, . The 

expressions are shortened from G^(2,,1) to G(2,,1), because they are 
i.i.d. Therefore, (III.B.1.M) can be written as 



g=oo 

P(X<x) = I F^(x+g)fQ(g)dy, (III.B. 1.6) 

g=L(x) 



where 



g^ ^exp(-g) g > 0, 

Til) 

0 otherwise, (II. B. 1.7) 

is the density of a Gamma (2,,1) random variable and again, because of 
the non-negativity 



fQ(g;il) = 
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0 if X > 0, 



L(x) 



-X If X < 0. 



(III.B. 1 .8) 



Differentiating (III.B. 1.6) using Leibniz' rule for the 



derivative of an integral with variable limits, we have, after some 
simplification 



analytically using integration by parts. If Z = 1 we obtain the density 
of the standard Laplace distribution. For 8, = 2,3, ^ the densities are 
also well known derivations given, for example, as textbook problems by 
Feller [Ref. 25: p. 64]. Feller however looks at the results of 

(III.B. 1.9) as the n-fold convolution (n = 2,3,4) of i.i.d. standard 
Laplace random variables. Figure III.B. 1.1 shows the densities of the 
2,-Laplace random variable for I = 1,2, 3, 4. Note how the graphs take on 
the shape of a Normal density with = 21. 

2. Numerical Evaluation of the 8, -Laplace Density 

If 8, > 0 and is not an integer, then (III.B. 1.9) must be 
evaluated numerically. We will be interested in the evaluation of the 
density in (III.B. 1.9) for 0 < 8, < 1 , in order to calculate conditional 
densities and likelihood function. 




(III.B. 1.9) 



Now if 8, is a positive integer, (III.B. 1.9) can be evaluated 
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^-LAPLACE DENSITIES FOR INTEGRAL X 

i=l (STANDARD LAPLACE) ^^=2 
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Figure III. B. 1.1. Examples of the i-Laplace Density for Integral Values of 

by Exact Evaluation of Equation III. B. 1.9 



Figure III. B. 2.1 displays examples of densities for non-integral 
% obtained by using the IMSL numerical integration scheme DCADRE to 
evaluate (III. B. 1.9). The upper limit of integration in (III. B. 1.9) is 
replaced by a suitable constant m > 1. Since for g > 1 and fixed % and 
u > 0, 



-J- (_! 

rMJi) (g(g+u) 



1-X, 



exp{-(2g+u)} 



, exp{-(u+2g)} 

r"Oi) 



(III.B.2. 1) 



then 



1 DCADRE -f^(u;)l) I < ^ ^ . (III.B.2. 2) 

Difficulty in integrating comes about because of the singularity 
at the lower limit of integration. If J, ^ 1 , this singularity 

1-j, j,-1 

disappears by rewriting (1/(g(g+p)) as (g(g+p)) . For J, < 1 , there 

are two alternatives for removing the singularity. We can transform the 

% 

variable of integration, g, to become t = g and the singularity at 
g = 0 is removed. Or, we could do an integration by parts to remove 
either the singularity at g = 0 for u > 0 or at g = -u f or u < 0 . In 
either case, the remaining integral must be evaluated numerically for 
u 0. 

Since X is a symmetric random variable we can rewrite 
(III. B. 1.9) using integration by parts to obtain an expression that will 
be easier to apply. For all u 0 
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-LAPLACE DENSITIES FOR NON-INTEGRAL i 

^=0 25 ^=0 50 
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Figure III. B. 2 . 1 . Examples of the Jl-Laplace Density for Non-Integral Values 

of a Numerical Evaluation of Equation III. B. 1.9 




exp(-2g)dg. (III. B. 2. 3) 



If J, ^ .5 note that is not defined at y = 0. For I > .5 and 

u = 0, 



3. The Square Root Beta-Laplace Transformation 

The principal result of this section is the proof of the so- 
called square root Beta-Laplace transformation theorem. By this 
technique, an -Laplace random variable can be transformed into an 
%^-La.^la.ce random variable where ^ The time series models 

developed in Sections III.C - III.F rely on the following: 



f^(0;?,) = r(2Jl-1)/{r*(S,) 2^*" ^ } < ® 



(III.B.2.U) 



Theorem: 



Let X - 2. -Laplace and B - Beta(o,2-a), where 0 < a < 2 and B is 

1 /2 

defined on the interval [0,1], i .e . standard Beta. If Y = B X, then 



Y - a-Laplace. 



Proof : 



By conditioning on B, we obtain the following expression for the 



characteristic function of Y; 
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1 /2 

= Eg[E{exp(ib Xoi) } ] 

= EgCU/d+bw^") I®”]. (III. B. 3.1) 



Since biu^ > 0, a convergent power series representation of 
(III. B. 3.1) is given by 



s (u)) = E { I 
^ ® k =0 



,[k] 

k! 



r 2^k.k, 

■(-wd b }, 



(III. B. 3. 2) 



where again = il( Jl +1 ) . . . ( Jl+k- 1 ) for k = 1 , 2 ,...; = 1 . 

Interchanging the expectation and summation in a convergent 
power series gives 



® fk 1 

^(^o) = I (- 0 )") E(B^) 

^ k =0 



(III. B. 3. 3) 



From Johnson and Kotz [Ref. 36: v. 2, p. HO], we have 



E(B*^) = for k integer. 



(III.B.3.H) 



Substituting (III.B. 3 .H) and (III. B. 3.3), we have 



[k] 






(III. B. 3 . 5) 

Q.E.D. 
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C. J, -LAPLACE FIRST-ORDER AUTOREGRESSIVE TIME SERIES MODEL 



1 . Introduction 

In this section, we exploit the square root Beta-Laplace 

transform to define a 2-parameter first-order autoregr essi ve model in 

S,-Laplace variables. The first parameter, I, determines the non- 

Gaussian symmetric marginal distribution of the time series ensemble. 

The second parameter, a, given the value of I, determines uniquely the 

lag-1 serial correlation. Since the model is shown to be first-order 

Markovian, a determines the entire autocorrelation function up to the 

sign. We show also that the models are always partially time reversible 

with respect to both runs probabilities and directional moments. 

Writing the stationary time series {X (H)} in the form of an 

n 

additive random coefficient equation, we have 

X U) = A^^^(a,«,-a)x ,(£) + H-a, a) L (H), (III. C. 1.1) 

n n n-1 n n 

where {X^OO} is assumed to be a stationary time series with a marginal 

1 /2 

i-Laplace distribution; {A (a,2,-a)} is an i.i.d. sequence such that 

n 

1 /2 

A^(a,H-a) is a standard Beta; ( B^ (S,-a,a)} is an i.i.d. sequence 

1 /2 

independent of {A (a,i-a)} such that B (i-a,a) is also standard Beta; 

n n 

and {L (H)} is an i.i.d. sequence, independent of the coefficient 
n 

processes, such that L (H) is 2, -Laplace. The coefficient A (a,2-a) and 
^ ’ n n 

B (J,-a,a) are assumed to be independent of X , t X , etc. If it is 

n n-1 n-2 

assumed that X ,()0 has a 2,-Laplace distribution, then by the theorem 
n-1 

in Section III.B.3 so does The fact that the process is 
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Markovian follows by construction. To start the process in the 
stationary distribution set X^Ol) = Lq(JI). 

The parameter space is H > 0 and 0 ^ a ^ Jl. 

For the Beta random variables A and B (hence their square 

n n 

roots) to be properly defined, each of the parameters must be positive. 

Hence, when a = 0 or a = Z, (III. C. 1.1) as defined above is no longer 

1/2 1/2 

appropriate because each of and B^ has a parameter that is 

1 /2 

identically zero. If a = 0, it is understood that the { A^ } sequence 

1 /2 

is identically zero and the { B^ } sequence is one; therefore, 

(III. C. 1.1) becomes ^^n^ sequence is the {L-^} 

1 /2 

corresponding to the i.i.d. Jl-Laplace case. For a = Z, A^ is one and 
1 /2 

B^ is identically zero; therefore, if = Lq(J,), then is 
J, -Laplace, but is not an ergodic process. 

If we let 



e = B^^^(H-a,a)L 00 
n n n 



(III.C.l .2) 



then by the Theorem in Section III.B.3, - ( J,-a) -Laplace with 

E(£ ) =0 and Var(e ) = 2(2,-a) for all n. Since the variance must be 
n n 

non-negative, it is also necessary that a ^ Ji. By the stationarity of 

1 /2 

{X } and since A (a,J--a)X , ( Jl ) is independent of e , the 

n n n- 1 ^ n 

characteristic function of the right-hand side of (III. C. 1.1) gives 



'^’XOO 



( w) 






(III.C.l .3) 
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Examples of sample path behavior for selected I and a are given in 



Figure III. C. 1.1. Note that although the correlation coefficient is 
approximately 0.8 for all sets of i and a in Figure III. C. 1.1, there is 
considerable difference in the sample path behavior as 1 changes. For 
the samples from small values of !l(.10 and .05), there are runs of 
values that are very nearly zero in magnitude. 

2. Correlation Structure 



Using equation (III. C. 1.1) recursively along with the 



stationarity and independence of the process w® have 



p(1) = Corr(X^Ol) ,X^_^ 00 



E{X^(ll)X^_^U)} 



E{X^_^00} 



E[{Ay^(a,)l-a)X^_^ 



E{X^_^00} 



E{ A^^Ca, H-a) }E{X^_^ 00 } 



= EtA^'^^Ca, i-a) }. (III. C. 2.1) 
n 



E{X^_^ (i)} 



From Johnson and Kotz [Ref. 36: v. 2, p. iJO], we have for 



A - Beta(a, 2.-a) , that 



n 



E(A‘"(a,H-a)) 

n 



r(a+r)r()0 

ru+r)r(a) 



for all n, r > 0, 



(III. C. 2. 2) 



where f(.) is the incomplete Gamma function. Therefore 
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BETA-LAPLACE AR(l); SAMPLE PATHS 




1 30 



Figure III. C. 1.1. £-Beta-Laplace AR(1) : Sample Paths; p(l)= 



p(1 ) = 



r(a+1 /2)r(Jl) ar(a+1 /2) r()l+1 ) 



ru+i/2)r(a) JirOi+1 /2)r(a+i ) 



(III. C. 2. 3) 



Note that as a -*■ then p(1) 1. Similarly as a ■* 0, p(l) -» 0. 

Therefore, we obtain a full range of positive correlations in a 

one-to-one function of a for any given value of 1. 

Also from (III. C. 1.1), we see that the process is explicitly 

autoregressive. It is also autoregressive in the sense of expectations 

in that E(X (S,)|x ,(Jl) = x) is a linear function of x. Since 
n ' n- 1 

Ik I 

( III. C. 1.1) defines a first-order Markovian process, p(k) = p(1)' ' for 
all k. To see this we write for all k 



p(k) 



ElX„(WXn.,U)} 

21 



1 /, ,U)X„ ^(11)) 

. ElAy^U.,t-a)) " ' 

n 2x, 



p(1)p(k-1) 



= p( 1 ) p(k-2) p( 1 ) 

= {p(i)}l'"L 



If we replace 



A 1 / 2 , A X 

A (a, l-aj 
n 



in (III.C.1 . 1) 



by -A 



1 /2 
n 



(a, H-a) 



we have 



r(g->-1 / 2 )r(il) ^ _ ar(g-^1 /2) fOl-M ) 
r(n+i/2)r(g) “ nr(Ji+i /2)r(g+i ) 



(III. C. 2. 5) 
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We can, therefore, achieve a full range of negative correlations, and 



likewise 



p(k) = (-1) '•'Upd)} 



for all k. 



(III.C.2.6) 



3. Partial Time Reversibility 

The )l-Laplace first-order autoregressive models are partially 

time reversible, both with respect to the directional moments, 

{X^(J,)X 00} for m = 0, ±1, ±2,..., and with respect to runs 

n n-m 

probabilities, P{X^(J0<X^_^ (Jl) } = P{ X^ (£) >X^_^ 00 } . 

Using mathematical induction, stationarity of {X^(2,)}, and the 

independence of the coefficients and the innovation from each other and 

previous values of {X^(Jl)}, it is the case that ^ ^ ^n-m ^ ^ 

= E{X OOX^ ( iO } = 0 for all n and for all m = 0, 1, 2 Let 

n n-m 

X^- J, -Laplace. For m = 0, E(X^) = 0 by ( III. B. 1.2). Assuming for m = k 

that E(X^X , ) =0, we have for m = k+1 after substituting from 
n n-k 

(III. C. 1.1) and (III. C. 1.2) that 



E{X^X ,0 = E{(A x^ +2A^"^^X ,e +eOX 

n n-(k+1) n n-1 n n-1 n n n-(k+1) 



= E(A )E{X^ ,X ,, ,0 

n n-1 n-(k+1 ) 



= E(A )E(X^X , ) = 0. 
n n n-k 



(III.C.3. 1) 



Assuming for m = k that E(X X^_ ) = 0, we have for m = k + 1 after 

n n K 

substituting again from (III. C. 1.1) and (III. C. 1.2) 
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(III. C. 3. 2) 



E{X n ^ } 
n n-(k+1 ) 



1 /P 

E{X* ,, , (A X +e )} 

n-(k+1 ) n n-1 n 






1 /? 

= E(A )E(X^ ,X ) = 0. 
n n-k n 



To see that this model is also partially time reversible with 

respect to runs probabilities, we show that the random variable = X^ 

- is symmetric. Now is symmetric if and only if the 

characteristic function of Z is real valued. We write 

n 

<p^(u)) = E[exp{io 3 (X^ - 

= E[exp[io3{e^-(l-Ay^)X^_^ }]] 



1 /? 

E{exp(iaje^)}E[exp{-io)(1-A^ ^^n-1 



EjECexp{-io)(1-a^^^)X^_^}]] 



1 U-a 



1 + 0 )^ 



i /i 1/2,2 2 

1 +( 1 -a ) 0 ) 



(III. C. 3. 3) 



Since (III.C.3.3) is real valued that concludes the proof. 
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D. THE BETA -LAPLACE AUTOREGRESSIVE MODEL, BELAR(I) 



1 . Introduction 

In this section, we set !l = 1 in (III. C. 1.1) and (III. 0.1. 2) to 
obtain the following expression for the BELAR( 1 ) process 

X = A^^^(a,1-a)X + e , (III. D. 1.1) 

n n n-l n 

where is an i.i.d. sequence with - ( 1 -a) -Laplace with moments 

and density given by (III. B. 1.2) and (III. B. 2. 3). X^ now has a standard 
Laplace marginal distribution. The only parameter in the model is cx 
with 0 s a ^ 1. All the results of Section III.C still hold with Jl = 1 . 
Examples of sample path behavior are given in Figure III. D. 1.1. 

We do two things in this section. First, we derive the 
equations for the conditional density of X IX ,. The second is the 
derivation of joint density and the logarithm of the likelihood 
function. The expression is used in Section III.E.6 to obtain the 
maximum likelihood estimate for a. 

2. The Conditional Density 

To find the conditional density of X |X . , we will need the 

n ' n- 1 

1 /2 

density of A (a,l-a). Let A be a standard Beta random variable with 
n n 

parameters (a,1-a). Since A^ is defined only on the given interval, 
zero to one 
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BELAR(I): SAMPLE PATHS 




- $ 



Vivo 





vivo 
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Figure III. D. 1.1. BELAR(l) : Sample Paths for Specified Values of a and 

Corresponding p(l)=Y 



P(A 



1/2 

n 



<a) 



P(A <a*) 
n 

0 



0 < a < 1 
otherwi se 




x=0 



(x;a)dx, 

n 



0 < a < 1 , 



where (x;a) is the standard Beta(a,l-a) density given 
n 



(x;a) 

n 



a“"\l-a)°‘/r(a)r(1-a) 0 < a < 1, 

0 otherwise. 



Differentiating (III. D. 2.1) with respect to a, we obtain 
expression for 



t /^{a-,0.) r(a)rd-a) 2. a 

A^ (1-a ) 



0 < a < 1 . 



Examples of (III. D. 2. 3) are given in Figure III.D.2.1. 

Now we evaluate P(X^ < x|X^_^=y) using (III.D.1. 

1 /2 

and (III. B. 2. 3). Conditioning on A^ (a,1"oi) we obtain 



(III.D.2.1) 



by 



( III.D.2.2) 
the following 



(III. D. 2. 3) 



, (III. B. 1.2) , 
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DENSITY OF FOR A^BETA(a, 1 -a) 
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1 /p 

P(X„<x|X„.,-y) . P{A^ (c.l-a)X„., . < xlx„.,-y) 



- E 1 X X - yA^'^^(a, 1-a) j A^^-a}] 

A 



= E ^ /2^ ^ J 



A n 

n 



a=L2(x) 

^ (x-ay) f da , (III.D.2.10 

I X X A 

a=L^(x) n 



where from (III. B. 2. 3) the cumulative distribution function of e can be 

n 

written as 



F (x-ay) 
^n 



. u=x-ay 

2 (u; 1 -a)du 

u=0 



u=ay-x 

[ f (u;l-a)du 
n 

u=0 



if x-ay ^ 0, 

(III. D. 2. 5) 



if x-ay < 0, 



and Lj^(x), i = 1,2 are the limits of integration on a which may be 
functions of x. 
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Since F (x-ay) changes definition for negative and positive 
^n 

(x-ay) and since 0 < a < 1 , we rewrite (111.0.2.^1) based on the ratio 
x/y, which is a constant. Thus 



P(X <x|X =y) = 
n ' n-1 ^ 



a=1 

f (x-ay)f ^ O'" 

•’„ ^n A 

a=0 n 



(III. 0.2. 6) 



a=x/y 

[ F (x-ay)f ^^^{a;a)6a 

_ ^n A 

a=0 n 



a=1 



+ f F (x-ay)f ^^^{a;a)da if 0 < x/y < 1 

^ ^ in A 



, 'n A 

a=x/y n 



Oif f erentiating (III.0.2.M) with respect to x using Leibniz' 
rule gives the following general expression for the conditional density. 
We have 



•■x |x 

n ' n-1 

a=L2(x) 

= 1’ f^ (x-ay;1-a)f ^^^(a;a)da 

a=L^(x) ^n 

n A 
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(III. D. 2. 7) 



-F {x-yL (x)f {L (x);a};^L (x) 
e 1 .1/2.1 Qx 1 

n A 

n 



From (III. B. 2. 3), (III. D. 2. 3) and (III. D. 2. 5) set 



h(g,a) . . (m.D.2.8) 

r^(l-a) r(a) (1-a^)“ (g+|x-ay|)' “ (1-a) 



Now using (III. D. 2. 7) to differentiate each expression in (III. D. 2. 6), 
we have the following explicit expressions for 



fx |x (x|y) - 

n' n-1 



a=1 g=“ 

j j h(g,a)dgda if x/y ^ 1 or x/y ^ 0 , 
a=0 g=0 



(III. D. 2. 9) 



a=x/y g=“ 

I J h(g,a)dgda 
a=0 g=0 



a=1 g=» 

+ I I h(g,a)dgda if 0 < x/y < 1 
a=x/y g=0 



It will be seen later that working with (III. D. 2. 9) will be 
inconvenient. Hence, we rewrite (III.D.2.9) as 
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3 = 1 

f { (x-ay) ; 1 -a}f ^^^{a;a)6a 



if x/y > 1 
or x/y ^ 0 




(x|y ) 



a=x/y 



a=0 



{ (x-ay) ; 1 -a}f ^^^(.a;a)da 
A 

n 



(III. D. 2. 10) 



a=1 






{ (x-ay) ; 1 -a}f ^ ^^(,a-,o.)(la if 0 < x/y < 1. 

, A 

a=x/y n 



The conditional density in (III. D. 2. 10) can assume different 

shapes as a function of x depending on the fixed conditioning value, y, 

and the particular, fixed a. If a = 0, then (III. D. 2. 10) becomes the 

standard Laplace density as given in (II. B. 1.1) with u = 0 and A = 1. 

If y = 0, then (III.D.2.10) becomes the ( 1 -a)-Laplace density as given 

in (III. B. 2. 3) with 2, = 1-a. In Figure III.D.2.2 are presented different 

examples of (III.D.2.10) for a fixed y and different values of a. Note 

that if a < 1/2 then (III.D.2.10) is continuous for all x. If a ^ 1 /2 

and X = y, (III.D.2.10) is undefined, e.g., x = y = 0. 

In a similar manner, expressions for ( III.D . 2. A)- ( III .D . 2. 1 0) 

can be derived for the BELAR( 1 ) model with negative correlat i ons . 
1 /2 

Placing -A^ (a, 1-a) in (III. D. 1.1), we replace x-ay by x+ay and 

determine the appropriate form of the conditional density based on the 
ratio (-x/y). We have for the negative BELAR(I) process 
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CONDITIONAL DENSITIES IN THE BELAR(I) PROCESSES 







142 



Figure III. D. 2. 2. Examples of Conditional Density of Given in the 

BELAR(l) Process 



fx |x (xly) ■ 

n ' n-1 



a=1 

f {(x+ay);1-a}f ^ 
a=0 n 

if -x/y S 1 or -x/y ^ 0, 



(III. D. 2. 11) 



a=-x/y 

f { (x+ay) , 1-a}f ^^^{a;a)(ia 

^ ^ n A 

a=0 n 



a=1 

+ f { (x+ay) ; 1 -a}f ^^^{aia)da 

^ , n A 

a= -x/y n 

if 0 < -x/y < 1, 



3. The Joint Distribution and the Likelihood Function 

An expression for the Joint density of can be written 



using f^ 1 ^ ^’^nl’^n-l^ ^X follows: 

n'^^n-l ^1 



n-1 

fy Y (^_»»**|X'i) — fy (X.) n fy 

n 1 1 k=1 n-(k-l)‘ n-k 



|x _ <V(k-i)IVk> 



(III. D. 3. 1) 



The log-likelihood function as a function of a given is just the 

natural logarithm of (III.D.3*!)* We have 



L(a) = -(In 2 + |xj ) + I ln{f. 



n-1 

k-i “V(k-n|Vk‘'"-<'<-'>l’‘''-'‘”' 



(III. D. 3. 2) 
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It is now a simple matter to determine which branch of 



(III.D.2.10) or (III. D. 2, 11) is needed for each pair 

substitute it into the sura in (III. D. 3. 2). We postpone further 
discussion of the likelihood function until Section III.E.6. 

M. Numerical Evaluation of the Conditional Density 
a. Introduction 

This section is devoted to explaining the methodology by 
which we came to resolve the problems in the numerical integration of 
the conditional density. This is as important an issue as the 
derivation itself, since the likelihood function and the maximum 
likelihood estimators can not be evaluated without it. As is pointed 
out below, the standard numerical routines were unsuccessful in 
accurately evaluating (III. D. 2.9) around the singularities in 
(III. D. 2. 8). We also give and justify the approximations that were used 
to remove each of the singularities. The graphs in Figure III. D. 2. 2 
were obtained using the method. The methodology was used again in 
Section III.E.6 to evaluate the log-likelihood function in the method of 
maximum likelihood estimation. 

In the FORTRAN routine that calculates the conditional 
density as given in (III.D.2.10), the approximations in (III.D.H.6), 
(III. D. 4.8) and (III. D. 4. 11) are added to the results from DCADRE. 
Combinations of these approximations are invoked as necessary depending 
on the ratio x/y. 

The same procedure is used to evaluate the density in 
( III. D. 2.11) for the BELAR(I) model which produces negative correlations 
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for odd lags. We Just check for 0 < -x/y < 1 and choose the appropriate 
value of c in (111.0.^4.6) and (III.D.41.8) where x-ay is replaced by 
x+ay . 

b . The Methodology 

Attempts to evalute the conditional density, as given by 
( III. D. 2.8) and (III. 0.2. 9), using the standard IMSL double integration 
routines failed. Even the IMSL routine OBLIN which is often successful 
in handling ill-behaved integrands, was unable to evaluate (III. 0.2. 8) 
around the singularities. For a < 1/2, along the lines a = 0 and a = 1, 
(III. 0.2.8) is unbounded. Similarly for a t 1/2, along the line a = 1 
and at the point (g,a) = (0,x/y) for 0 < x/y < 1, (III. 0.2. 8) is 
unbounded. Arbitrarily declaring (III. 0.2. 8) to be zero under these 
conditions did not always allow OBLIN to accurately evaluate 
(III. 0.2. 9) . 

We succeeded in evaluating the conditional density by 

working with the form given by (III. 0.2. 10) with f . ,„(a;a) given by 

A 

n 

(III.0.2.3) and f {(x-ay);1-a} given by (III. B. 2. 3). We used the IMSL 
e 

n 

routine OCAORE to construct an extensive table of values for the (1-a)- 
Laplace density with the intention to linearly interpolate from the 
table as needed. The error in the value of f^(|u|;1-a) in the table is 
controlled by OCAORE. The error in the value of f ^ ( | u^ | ; 1 -a) obtained 
by using linear interpolation for |Uq| not in the table is calculated in 
the standard way. From Gerald [Ref. 28: p. 168 ] 
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d^f (c; 1-a) 
e 



I Error Interpolation! = 



n 



(III.D.il.l) 



where h is subinterval length and s = (uQ-u)/h. Substituting the second 
divided difference into (III.D.M.1), in place of the unknown second 
derivative and also noting that the worst case for linear interpolation 
is at the center of the subinterval, we have 



where A^f is the second difference. Because f ( u ;1-a) is non- 
n n 

negative and monotone decreasing in juj, the largest values of A^f are 
in subintervals close to zero. The table that was constructed, 
therefore, uses smaller subintervals close to zero and larger 
subintervals further out. 



near the singularities, which we were able to evaluate analytically and 
then add back. The technique is often referred to as "removing the 
singularity" . 

c. Removing the Singularities Due to (III. D. 2. 3) 

We now describe how we evaluated the integrals in 

(III. D. 2. 10) in the vicinity of the singularities in (III. D. 2. 3). We 

1 /2 

see that the density of (a, 1-a) given in (III.D.2.3) is undefined at 
a = 0 and a = 1 for a < 1/2 and at a = 1 for a ^ 1/2. We also note from 
(III.D.2.3) that for small 6 > 0 and a < 1 /2 



I Error Interpolation! < 




(III.D.11.2) 



n 



Finally we used DCADRE again to evaluate (III. D. 2. 10) except 
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1/2 



(a;a) 



2a 



2a- 1 



- r(a)rd-a) 



0 < a < 6; 



(III.D.4.3) 



and for all 0 < a < 1 



f ,p(a;a) - , 1-6 < a < 1 . (III.D.'I.'I) 

~ r(a)r( 1 -a)( 1 -a 2 )“ 



Therefore for a < 1 /2 and 1 ^ x/y or x/y S 0 we have from (III.D.il.3) 



a=6 



a-6 „ 2a-1 

2a 



I f ^ {(x-ay);1-a}da - j r(a) f( 1-a) t ^ • 

a=0 ^ 



a=0 ^n 



(III. D. 11. 5) 



Since f (•) is continuous in this situation, there exists a number c so 
^n 

that 0 < c < 6 and |x| ^ |x-cy| ^ |x-6y| and 



a=6 



/ r(a)r(1-a)^e t - f^ {(x cy);1 a} | r(a)r(1-a) 



2a-1 



a = 6 



2a 



2a-1 



a=0 



a=0 



= f^ {(x-cy);1-a} 
^n 



(1-g)6^ 
r(2-a)r(1+a) ‘ 



(III. D. 11.6) 



A natural approximation for c allows |x-cy| to be the average, 
(1/2)|2x-6y|. 

For all a and 1 < x/y or x/y < 0 we have from (III.D. ii. il) 
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a=1 

f f ^ ^ { (x-ay) ; 1 -a}da 

, r A 

a=1-6 n 



a=1 



■ 1 
a=1 -6 



2a 



r(a)r(1-a) _2ia e 



f {(x-ay);1-a}da . (III. D. 4. 7) 

(l-a^)“ 



Likewise there exists a new number c so that 1-6 < c < 1 and 
|x-y| < lx-cy| < |x-y+6y| and 



a=l 



a=l-6 



1 



2a 



r(a)rd-a) 



(l-a^)“ S 



f { (x-ay) ; 1 -a}da 



a=l 



= f {(x-cy);1-a} f , > > ^ ( )da 

3 .= ) “6 



{(x cy);1-a} r(“ljr( 2 -a) * 
n 



(III.D.i).8) 



Again a natural approximation for c allows |x-cy| to be the average, 
(1/2) |2(x-y)+6y| . 

d. Removing the Singularity Due to (III. B. 2. 3) 

The final type of singularity occurs when 0 < a = x/y < 1 

and a 2 1/2. When this situation occurs we leave f (•) under the 

e 

n 

integral and argue that in a 6 -nei ghbor hood around x/y < 1, 
f = f Note that by the same argument that gave us 
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(III. D. 4. 6) and (III. D. 4. 8), there exist two numbers and so that 
(x/y)-6 ^ ^ x/y and x/y ^ ^ (x/y)+6 and 



a-x/y 

f f { (x-ay) ; 1 -a}da 

a=(x/y)-6 n 



a=(x/y)+6 

+ I f {(x-ay);l-a}da 

, A ^n 

a=x/y n 



a=x/y 

= f 1/2^°!’“^ I t(x-ay);1-a}da 



A ' / / \ r 

n a=(x/y)-6 



a=(x/y)+6 



+ f , /^(c.,;a) I { (x-ay);l-a}da. (III. D. 4.9) 



J/2' 2’ • j e 

A , n 

n a=x/y 



We chose to approximate and C 2 both by x/y for x/y # ± 1 , and have 

f .^{Y./y,a) < “ for all a. If x/y = 1 or x = 0 and y = 0 simultane- 
n 

ously, the value of (III. D. 2. 10) is undefined for a ^ 1/2. 

Now changing the variable of integration so that (x-ay) = u, 
we have from (III. D. 4.9) that for all a t 1/2 

a=x/y a=(x/y)+6 

I f^ { (x-ay) ; 1 -a}da = | f^ { ( x-ay) ; a}da , 

a=(x/y)-6 a=x/y 
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u = |y6l 

“ T?r / f'e^(u;1-a)du 

u=0 

S [^] , (III.D.4.10) 

because f (•) is a symmetric density. That is (III.D.4.10) is an 
e 

n 

expression for jyy P(0 < < ly^l) where is the ( 1 -a) -Laplace 

innovation random variable. Therefore, we add back to the DCADRE result 
the amount 



< |y6l)} ^ f < « , y * 0. 

^ ^ (III. D. 4. 11) 

We choose the following combination as the value for 
P(0 < < |y6l ). 

i) Using the trapezoidal rule and the table of values for 
the ( 1 -a) -Laplace density we found 



P/0 < < |y6|) 



1 /2 



u=M 



I 

u=ly6| 



f (u;1-a)du . 
e 

n 



(III.D.4.12) 



Equation (III.D.4.12) is the average of the upper and lower Riemann sums 

of the tail of the density subtracted from 1/2. Using (III.D.4.12) 

instead of directly integrating f (u;1-a) from zero to jyd] is 

n 

preferrable, because for a ^ 1/2, f (0;1-ot) is undefined. The error in 

£ 

n 
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( III.D. M. 1 2 
approximately 



from using 

-J2 



the trapezoidal rule 
t h 

in the i^ subinterval. 



approximation is 
Even though there 



n 

are over ^00 subintervals, the second differences A^f^(i) are very much 
smaller for a ^ 1/2 in the interval [ly6[,M]. 

ii) A second measure of P(0<e^<|y6|) is the lower sum 



P2(0 < < |y6l) = |y6lf^ {(y6);1-a}, (III.D. 4. 13) 

^n 



since P(0 < < |y^l ) always at least as large as (III.D. 4. 1 3) . Our 

approximation for P(0 < < [yfij ) is the maximum of (III.D. 4. 12) and 

(III.D. 4. 1 3) . We use the maximum because P^ given by (III.D. 4. 12) could 

be negative when |y6l is close to zero. This follows because F (u;1-a) 

' ‘ e 

n 

is strictly decreasing for u > 0, and thus the trapezoidal rule over- 
estimates the integral in (III.D. 4. 12) . 

E. PARAMETER ESTIMATION IN THE BELAR(I) PROCESS 
1 . Introduction 

In this section, we develop estimators for the parameters in the 

BELAR(I) process and report results on properties of these estimators 

obtained from analytical comparisons and simulations. We examine 

estimators for the location parameter, u, and the scale paramter, X, 

of the series {X }; the parameter, a, of the random coefficient 
1 /2 

A^ (a,1-a); and Y, the lag-1 serial correlation, which is a monotone 
function of a. 
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The theory of conditional least squares estimation for the 

BELAR(l) process using the linearized residual is derived using results 

from Nicholls and Quinn [Ref. 16]. We give a corollary to their Theorem 

3.1 pertaining to the strong convergence and asymptotic Normality of the 

least squares estimator of Y, the lag-1 serial correlation. An estimate 

1 /2 

for a is derived using the fact that Y = (a,1-a)}. Also, we show 

that the joint least squares estimator of location and correlation for 
the BELAR(I) process is the same as for the linear AR(1) processes. 

Other estimators of lag-1 serial correlation in the BELAR(I) 
process are derived using the ideas of robust estimation of Huber 
[Ref. 37] and least absolute deviation (LAD) estimation as applied to 
ordinary linear autoregressive models by Denby and Martin [Ref. 38] and 
Bloomfield and Steiger [Ref. 39]. Although these estimators are 
consistent and asymptotically unbiased in linear models, for the random 
coefficient models the results of the simulation study show that they 
have a bias that does not go to zero asymptotically. 

The maximum likelihood estimator of a, S,„ „ , is found using an 

MLE 

iterative technique with the initial estimate being derived from the 
least squares estimate of serial correlation, Y, _ . 

Many of the simulations comparing the different estimators are 
conducted within the framework of SIMTBED [Ref. 15]. From the Summary 
Statistics table generated by SIMTBED for each estimator, it is possible 
to draw conclusions concerning the bias, the variance at different 
subsample sizes, the asymptotic variance, and how fast the estimator 
approaches asymptotic Normality. In the SIMTBED program, one can 
specify the total number of samples examined at each subsample size. 
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The total number of samples used is the product of three parameters, N, 



M, and NSR. Three combinations of these parameters were used. Table 
III. E. 1.1 is a summary of the number and types of subsample sizes, N, 
and the number of independent repetitions, M, of each type of simulation 
conducted using SIMTBED. 

TABLE III. E. 1.1 
Summary of SIMTBED Types 



Number of 









Subsample 


Sizes 


(N) 








Super 

Replicati 


Type 


25 


50 


75 


1 00 


125 


175 


250 


500 


(NSR) 


I 


2000 


1000 


660 


500 


400 


280 


200 


1 00 


5 


II 


ilOOO 


2000 


1 330 


1000 


800 


570 


400 


200 


1 0 


III 


8000 


4000 


2660 


2000 


1 600 


1 1 40 


800 


400 


1 0 



Each entry in a Summary Statistics table, which is the output of 
SIMTBED after super replication, is a pair correspondi ng to a mean 
(average over the number of super replications, 5 or 10) and an 
estimated standard deviation of that mean value. Fran Table III. E. 1,1, 
it is clear that a large number of independent realizations was used in 
the computation for each super replication and the different subsample 
sizes. Because of this, subsequent tests of hypothesis that we use on 
the simulation outputs will be t-tests on the mean of a random sample of 
size 5 or 10 drawn from a Normal population where o* is unknown, but is 
estimated from the sample. 

Before describing each estimator and simulation experiment, it 
is convenient now to summarize the conclusions of this investigation 
into the estimation of parameters in the BELAR(I) process: 
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a. The simulation results from SIMTBED indicate that both the sample 
median and sample mean are asymptotically Normal estimators of y. 
The asymptotic variance of the sample mean is approximately twice 
that of the sample median across all values of the correlation 
coefficient, Y. 

b. The simulation results from SIMTBED also indicate that the mean 
absolute deviation, given in (II.E.3-2), is an unbiased and 
asymptotically Normal estimator of the scale parameter, X. It 
also has the smallest asymptotic variance of the three estimators 
considered . 

c. The least squares estimator of Y, the lag-1 serial correlation is 
asymptotically unbiased and Normally distributed. Simulation 
results support this conclusion. 

d. Simulation of other estimators of lag-1 serial correlation based 

on non-linear residuals of the form R = X -YX , + 6f(X ,X ,) 

n n n-1 n n-1 

indicates that the value of (Y,6) that maximizes the sum of 

squares of R is approximately (Y, _,0). 

n Lio 

e. Robust estimators of serial correlation based on certain symmetric 
loss functions of the linear residual (other than the sum of 
squares) are biased and, apparently, asymptotically biased. 
SIMTBED outputs of the Huber(c), rank and LAD estimators of lag-1 
serial correlation clearly exhibited this result. 

f. The maximum likelihood estimator of Y, the lag-1 serial 
correlation was computed by the iteration scheme given in Section 
III.E.6 for simulated data from the BELAR(I) process. Results of 
the simulation appear to indicate that the estimator is converging 
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to a Normal distribution with a mean value equal to the true Y. 
In comparison to the least squares estimator, the simulation 
results indicate that the maximum likelihood estimator has a 
smaller variance and bias at all values of Y. 

2. Estimators of Location 
a. Introduction 

The sample median, m, and the sample mean, X, are two 
commonly used estimators of the location parameter, u, in a stationary 
process with a symmetric marginal distribution. The sample median is a 
particularly attractive alernative to X when the symmetric distribution 
is also thick-tailed. (It is well known that for i.i.d. processes with 
a double exponential marginal distribution that the sample median is the 
maximum likelihood estimator of y) . 

For i.i.d. processes, it is well known (Dudewiez, [Ref. MO: 

p. 221]) that X has an asymptotically Normal distribution, N(0,/o^/n). 

Likewise, m is asymptotically Normal, N{0, /l/Unf^Cx The results 

A . b 

for the sample median hold provided f^(x ,. ) is continuous in a 

X . b 

neighborhood around x is positive, and is bounded above. 

.b 

The problem of estimating y from dependent data is more 
difficult. Analytical results exist about the limiting distribution for 
X in ergodic processes and for the sample median for processes 
satisfying certain mixing conditions. (Mixing processes are those for 
which random variables "sufficiently far apart" are approximately 
independent) . 
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Since the BELAR(I) process is an RCA(1) process with i.i.d. 
innovation aind random coefficient processes, is ergodic (Nicholls 

and Quinn [Ref. 16: p. 37]). Therefore X is still an unbiased 

asymptotically Normal estimator of p, but the variance is modified by 
the factor 



1 + 2 I = (1+Y)/(1-Y). (III. E. 2.1) 

k=1 

See, for example. Priestly [Ref. 33: p. 3^3]. 

The problem of estimating the median has been studied for 
cases where the data are dependent. From Heidelberger and Lewis 
[Ref. *11)], we have that the usual order statistic point estimate 
(sample median) is still valid, but the variance is modified by a 

factor, p(x ^). Here p(x _) is the initial point on the spectrum of the 

. b . b 

binary process {I (x _)}, where 
n . b 



1 if X < X, 
n 

0 otherwise. 



(III. E. 2. 2) 



That is 



n 

p(x ^) = lim n Var { ^ I.(x .)/n}. (III. E. 2. 3) 

.b n^“ 1 .b 

As was already pointed out, conditions for convergence and 
Central Limit Theorems for the sample median depend on mixing 
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conditions. There are several kinds of mixing conditions. It is not 
known, however, if the BELAR(I) process satisfies any of them. 

However, the LAR( 1 ) process does satisfy the mixing 
conditions of Gastwirth and Rubin [Ref. n]. Thus, for the LAR(1) 
process, it is known that the sample median has an asymptotic Normal 
distribution with mean zero, and variance given by 

I {yl'^lcoshCx + sinhCx (III. E. 2. 4) 

k=-oo -5 .5 U-YJ 

Gastwirth and Rubin [Ref. 14] showed that for the LAR(l) 
process, the asymptotic variance of X is twice that of the sample median 
across all values of serial correlation. 

The question here is, what are the properties of the sample 
median in estimating p from data of the BELAR(l) process? Also, how does 
the sample median compare to X in the BELAR( 1 ) process? 

Since from both the BELAR( 1 ) and the LAR(1) processes 

have a marginal Laplace distribution and first-order autoregressive 
correlation structure, the hypothesis is that the sample median from the 
BELAR( 1 ) process behaves similarily to that generated from data in the 
LAR(1) process. Also, the relative efficiency of m to X is the same in 
the two processes. 

To substantiate this assumption, the sample median and 
sample mean were compared in simulation experiments in SIMTBED for data 
generated from the BELAR( 1 ) process. The simulation output is compared 
to the theoretical results for the LAR( 1 ) process. 
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b. Simulation Results 



For a = .1 and a corresponding correlation coefficient of 
Y = . 1 7664, the estimators X and m were simulated in SIMTBED using a 
size of Type III from Table III. E. 1.1. The results are given in the 
Summary Statistics in Table III. E. 2.1. Looking at Table III. E. 2.1 for 
N = 100 and greater, there is no evidence of non-Normal ity from the 
first four estimated moments of the sample mean. The leading 
coefficient in the asymptotic expansions for E(X) and Var(X) do not 
deviate significantly from the theoretical values, i.e. X is unbiased 
and Var(X) = 2.8581 /N. 

Looking at Table III. E. 2.2, the Summary Statistics af a = .1 
for m, it appears that even for N = 25, m is unbiased and the sample 
skewness is fluctuating about zero. The variance, however, at each 
subsample size up to N = 250 deviates significantly from a hypothetical 
asymptotic variance of 1.4291/N, the corresponding result for LAR(1). 
This is explained by the kurtosis of the estimate m of the median which, 
although decreasing with increased subsample size, is still 
significantly different from 0 until N = 250. The leading coefficients 
in the expansions for the expectation and for the variance are not 
significantly different from 0 and 1.4291 respectively. Since the data 
are only slightly correlated, we could have expected the sample median 
to behave similarily to that of the case of the completely random 
process with Laplace marginals, i.e. m is unbiased, asymptotically 
Normal, and has a variance with leading coefficient 1/n. 
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REGRESSION ON VARIANCE - COEFFICIENTS: iilSl?! 1 i 6 ? 53 i 



SIMTBED Summary Statistics for Estimating y by m in the 
BELAR(l) Process with a=.l and y=* 17664 
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For values of a = .5 and with corresponding Y » .63662 

and .89986, using Type II experiments as described in Table III. E. 1.1, 
we again compared the behavior of X and m. 

From Tables III. E. 2. 3 and III.E.2.^, we see that the 
behavior of X is as expected. The sample mean appears to be unbiased. 
For N ^ 250, there is no evidence of non-Normal ity . The estimates of 
the leading coefficient in the asymptotic expansions for the variance 
agree within one standard deviation of the postulated values of 9.0 and 
38. 

The corresponding results for m are given in Tables 
III. E. 2.5 and III. E. 2.6. The sample median shows no bias and appears to 
be asymptotically Normal after N ^ 250. In each case (a = .5 and 
a = .8^1)) the leading coefficient in the expansion for the variance is 
smaller than the corresponding value for the variance of the sample 
median in the LAR( 1 ) process, i.e. ^.5 and 19 respectively. 

The analysis thus far has indicated that at least for data 
with non-negative correlation in the BELAR( 1 ) process, there is little 
evidence to suggest that the behavior of the sample median is 
significantly different than in the LAR( 1 ) process. From Table 
III. E. 2. 7, we see the same kind of results that Gastwirth and Rubin 
[Ref. 1i|] reported. As sample size increases, the efficiency of X 
relative to m drops to 50 %. 
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SIMTBED Summary Statistics for Estimating y by X in the 
BELAR(l) Process with a=.5 and y=* 63662 
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REGRESSION OM VARIANCE - COEFFICIENTS: 



SIMTBED Summary Statistics for Estimating y by X in the 
BELAR(l) Process with a=.844 and y=- 89986 
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REGRESSION ON VARIANCE - COE Ff I Cl ENf S; 



SIMTBED Summary Statistics for Estimating y by m in the 
BELAR(l) Process with a=.5 and y=. 63662 
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ESTIMATOR: SAMPLE MEDIAN; M0»0.0 

•***•• WIDEST Y VALUES FOUND: YMIN=*-3.87S YHAX- 2.972 
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RtGRESSIOM ON VARIANCE - COEE f I Cl ENTS: 5!5:JSJ 



TABLE III. E. 2. 7 



Efficiency of X Relative to m in BELAR( 1 ) for Y > 0 



N 


Y = +. 1766^ 


Y = +.63662 


Y = +.9 


25 


.6H 


.69 


.98 


50 


.58 


.58 


.81 


75 


.55 


.55 


.73 


1 00 


.54 


.52 


.67 


125 


.52 


.50 


.62 


175 


.53 


.il9 


.57 


250 


.51 


.^7 


.53 


500 


.50 


.^7 


.H8 


1 . For Y = 


+.1766 the results 


are based on a Type 


III experir 



For the other 
experiments . 



two cases, the results are based on Type II 



We also simulated X and m for negatively correlated data 
from the BELAR(I) process. Type III simulations were used for X and m 
at Y = -.63662 and a Type II simulation for X at Y = -.9. From the 
Summary Statistics for X in Tables III. E. 2.8 and III. E. 2. 9, we see X is 
unbiased and approximately Normal for sample sizes greater than 125. 
Estimates for the coefficients for the asymptotic variance are not 
significantly different from the theoretical values of and .1053- 

From Table III. E. 2. 10, the most obvious point to be made is that 
even for moderately negatively correlated data, m is not Normally 
distributed even for subsamples of size 500. The sample median is 
unbiased, but the kurtosis is not decreasing fast enough. The variance 
of the sample median even at N = 500 is almost certainly not 
( 1 /N) (1 +Y/1 -Y) . However, the leading coefficient in the expansion for 
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the asymptotic variance is within a standard deviation of the 
hypothetical values ( 1 /N) ( 1 +Y/1 -Y) . This would indicate, for the case 
of negative correlation, a much slower convergence of the sample median 
to Normality than for positively correlated data. 

For negatively correlated data from the BELAR( 1 ) process, we 
have observed that X does not lose efficiency relative to m as fast as 
for non-negatively correlated data. In fact, from Tables III. E. 2. 8 and 
III. E. 2. 10, it is clear that the variance of X is smaller than m for 
subsample size N ^ 100. 

3. Estimators of Scale 
a. Introduction 

In the case of estimating the scale parameter, X, we 

considered three estimators. Since Var(X ) = 2X^, we considered 

n 

X^ = S// 2 where 






I 

N 



N 

I 



i=1 




X)". 



(III.E.3. 1 ) 



Since the maximum likelihood estimator of X for an i.i.d. sample with 
marginal Laplace distribution is the sample mean absolute deviation 
about the median, we set 

. N 

^2 ° N ^ l^i ■ '"I • (III.E. 3.2) 

i=1 
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SIMTBED Summary Statistics for Estimating y by X in the 
BELAR(l) Process with a=.5 and y=~. 63662 
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ESTIMATOR: SAMPLE MEAN; MU-0.0 

WIOEST y VALUES fOUHO: YMIM--.N513 YMAX-0.4718 



SIMTBED Summary Statistics for Estimating y by m in the 
BELAR(l) Process with a=.5 and y=~*^3662 
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REORESSIOM OH VARIANCE - COEfF 1C I ENTS: SJ^^Si^E-OI 0?)}??^ ‘2!j3|}i W.stU 



As a third alternative, we chose the scaled median absolute deviation 



about the median, 



med( 



IX. - 



m 



( .69315 



(III. E. 3. 3) 



The scaled median absolute deviation is a frequently used robust 
estimator of scale [Ref. 38]. In the simulations, we assumed that 
are Laplace with median = mean = 0 for all n. Table III. E. 3.1 contains 
a summary of the type simulation (as defined in Table III. E. 1.1), the 

^ A ^ 

estimator (X values of a and Y that were used. 

TABLE III.E.3. 1 



Summary of Simulation Schedule for Estimators of X 



Y 



a 

Estimator 



X 

A 

X 

/s. 

X 



1 

2 

3 



-.89986 .17664 .63662 
.844 .1 .5 

Type II Type III Type I 
Type II Type III Type I 
Type II Type III Type I 



b. Simulation Results 

In the Type III simulation (See Tables III.E.3. 2 - 
III.E.3. 4), using slightly correlated (Y = . 17664) realizations of the 

A 

BELAR(I) process, we found the best estimator of X to be X^, the sample 
mean absolute deviation. It appears to be unbiased for all subsample 
sizes. The skewness and kurtosis are decreasing with increased sample 
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SIMTBED Summary Statistics for Estimating X by X^ in the 
BELAR(l) Process with a=.l and y=* 17664 
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REGRESSION ON VARIANCE - COEFFICIENTS: o!lS883S 
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RtCfltSSlOH OM VARIAHCE - COEf f I C I EMTS: 



SIMTBED Summary Statistics for Estimating X by X3 in the 
BELAR(l) Process with a=.l and y =- 17664 
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si zes . 



But. even for N = 500, the skewness is still significantly 



different than 0. Using two-sided t-tests with 18 degrees of freedoa 
for the equality of means of two Normal populations with unknown 
variances at the 90J confidence level, we reject each of the hypotheses 
independently that: (1) Var(X^) = VarCX^) and (2) Var(X^) = Var(X^). 

The mean relative asymptotic efficiency of X^ and X^ to X^ are estimated 
from the regression on variance coefficients to be 76$ for X^ and 60$ 
for X^. 

A A 

Both X^ and X^ appear from the simulation to be biased. 
From the second coefficient in the mean of regression on average in 
Table III. E. 3-2, X^ appears to have a negative bias of orcer(IA'). From 
Table III.E.3.^ it appears that X^ has a positive bias of order (1/N). 
However, since the leading term in the expansion of t.he mean for both 
estimators is the true value of 7, it appears that both X^ and X^ are 
asymptotically unbiased. 

When we considered moderately to highly correlated data (see 
Tables III. E. 3. 5 - III . E. 3. 10) , the differences in the behavior of the 
estimators were not as easy to discern. The particular bias of X, and 
X^ was even more apparent, especially at the smaller subsample sizes. 
As [t] increased, so did the expressions for the asymptotic variances. 
At each of the subsample sizes, in both types of correlation, X^ had the 
highest estimated variance. The variance of X^ was significantly 
different than that of X^ at all levels of significance and subsample 
sizes up to N = 500. However, we could not reject that the as>'mptotic 
variances of X^, X^ and X^ were the same at each of the two levels of 
correlation . 
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SIMTBED Suitunary Statistics for Estimating X by X^ in the 
BELAR(l) Process with a=.5 and y=*63662 
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REGRESSION ON VARIANCE - COEFFICIENTS: 0?)5!32X ISiUSl ?J3:82S 



SIMTBED Summary Statistics for Estimating X by X 2 in the 
BELAR(l) Process with a=.5 and y=- 63662 
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RECRESSIOM ON VARIANCE - COEmCIENlS: ohlUU ‘HiSSII 



SIMTBED Suininary Statistics for Estimating X by X 3 in the 
BELAR(l) Process with a=.5 and 63662 
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SIMTBED Summary Statistics for Estimating X by in the 
BELAR(l) Process with a=.844 and 
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REGRESSION OM VARIANCE - COEFFICIENTS: 



SIMTBED Summary Statistics for Estimating X by X 2 in the 
BELAR(l) Process with a=.844 and 
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SIMTBED Summary Statistics for Estimating X by X3 in the 
BELAR(l) Process with a=.844 and y=-. 89986 
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4 . Least Squares Estimation of Serial Correlation 



In this section 

has a standard Laplace 

X by 
n ^ 



, it is assumed, unless otherwise stated, that 
(p = 0,X = 1) distribution. If not, standardize 



X - p 

X’ = -5^7; , (III. E. 4.1) 

" X 

where p and X will be specified from those estimators already discussed 
in III.E.2 and III.E.3. 

The least squares estimator of the lag-1 serial corrlation, Y, 
is derived. First, we show that the BELAR( 1 ) process is an RCA(1) 
process of Nicholls and Quinn [Ref. 16 ]. Then, we define the linearized 
residual in the BELAR( 1 ) process and state some of its properties. From 
these properties and some results from Nicholls and Quinn for RCA 
processes, we derive the asymptotic properties of The properties 

of Yj^g are observed also in the simulation results for selected values 
of Y. Finally, the joint least squares estimator of location and serial 
correlation are derived for the BELAR(I) process. 

Rewriting (III. D. 1.1) by adding and subtracting YX^_^ , we have 

1 {Ay^(a,1-a) - Y}X„ , + e„ , (III.E.4.2) 

n n-i ^ n n-1 n 

where Y = E{A^^^(a,1-a)} as given by (III. C. 2. 3) for 

1 /2 

I = 1 (a,1-a) - Y} is an i.i.d. process stochastically independent 

of the i.i.d. {e }. The variance of the random coefficient is (a - Y^) 
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for all n. As can be seen from (III. C. 2. 5) and the fact that 0 < a < 1, 

if we know a, then we also know |y| and vice-versa. That is, in the 

BELAR(I) process, there is only one independent parameter to estimate 

for the correlation. Now, we recognize (III.E.il.2) immediately as an 

RCA(1) process of Nicholls and Quinn [Ref. 16 ]. Since and 

1 /2 

{A^ (a,1-a) - Y} are each identically distributed as well as being 

serially independent and independent of each other, we have by theorem 
2.7 [Ref. 16] that {X^} is the unique strictly stationary and ergodic 
solution to (III. E. 4. 2). 

There are two ways to look at the linearized residual in the 
BELAR( 1 ) process just as described in Chapter II for the NLAR( 1 ) model; 



From ( III. E. 4 , we see that since {X^} is strictly stationary, so is 

{R }. Also, we see E(R ) = 0 and Var(R ) = 2(1-Y^). Lawrance and Lewis 
n n n 

[Ref. 22] proved that the R^ are uncorrelated, but in general, not 

independent. From ( III. E. i|. 3) . we note that for any n, R^* unless 

a = 0. Except for when a = 0 or 1, Var(R ) > Var(e ). As a increases 

n n 

from zero to one, both Var(R ) and Var(e ) decrease raonotoni call y from 

n n 

two to zero. This is evident from the definition of Y in (III. C. 2. 5) 
with I = 1 . 



R = {A^^^(a,1-a) - Y}X + e , 
n n n - 1 n 



(III.E.il.3) 



or 




(III.E.il.4) 
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Two other properties of obtained from (III. E. 4.3) by 

conditioning on the independent, identically distributed processes t } 
1 /2 

and (a,1-a) - Y} up to time k = n - 1. We have 

E[R^|{ej^,Ay^(a,1-a) - Y} ; k = 1,2,...,n-1] 

= X ,E{A^^^(a 1-a) - Y} + E(e ) = 0, (III. E. 4. 5) 

n~l n n 

1 /2 

because {A^ (a, 1-a) - Y} and are independent of the process through 

time n-1 and is a function only of the process through n-1 . 

E[R^|{£k, A^[^^(a,1-a) - Y} ; k = 1,2,...,n-1] 

= E(e2) + .E[{Ay^(a,1-a) - Y}^] 

n n-1 n 

= 2(1-a) +x^_^(a-Y^), (III. E. 4. 6) 

which is only a function of a or Y^ alone, since a determines Y^ and 
vice-versa . 

Now using (III. E. 4. 4) and a given realization of {X^} of size n, 
n 

we minimize Y R? with respect to Y to obtain the conditional least 
i=2 ^ 

squares estimate for Y. This is the same procedure as described for the 
NLAR(I) process. We have 



> n n 

x.x. ,)/(^ X? ) 
LS \-2 ^ i-2 



(III. E. 4. 7) 
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Two problems can occur using (III.E.^.7), especially for small 
sample sizes. For the BELAR(I) process defined by ( II I . E. ^4 . 2) , 

Ak A. 

1 S Y ^ 0, and yet it is possible that Yj^g < 0 or ^ ^ 

-1 < Y, „ < 0, we would estimate that the sample {X } came from the 
LS n 

1 /2 

BELAR(l) process with the negative sign on A^ (a,1-a). If |y| > 1 , we 
would estimate Y by +1 or -1. 

In order to obtain the "least squares" estimate for a, we solve 
numerically for a in 

Lio 



I ^Lsl 



r(SLs^i/2) 



- 



(III. E. 4.8) 



for a given Y^g from (5.7) if < 1. 



LS' 



The estimator Yj^g given by (III. E. 4. 7) has the following 
properties which we state as a corollary to Theorm 3.1 [Ref. 16]: 



CORROLLARY. For {X^} given by (III. E. 4. 2); {R^} in (III. E. 4. 3) 
and (III. E. 4. 4), the least squares estimator Y, „ has the following 

Li O 

properties : 



a) Y^g-"-^--Y; 



b) Since E(X‘*) = 24 < «, ( N-1 ) ^ ^^ ( Y, „-Y) has a distribution 
n LS 

which converges to the Normal with a mean of zero and a variance o* 
given by 
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= 1+5a-6Y^. 






The proof follows from Theorem ( 3 • 1 ) • The strict stationarity 
and ergodic nature of leads to the almost sure convergence. The 

results of (III.E.^.5) and (III.E.M.6), together with Billingsley's 
Martingale Central Limit Theorem provide the results for the asymptotic 
Normality of 

A strongly consistent estimator for the variance, o^, is also 
given in [Ref. 16] for the general RCA(1) process. For in 
(III.E.4.9), this estimate becomes 



(n-1) 

n 



I X 

i=2 



2 

i-1 



n 






n 

I X? + 
^ 1 



(5 -Y' ) ^ 

^ LS LS^ i=2 

n 

I X? , 

i=2 



i=1 



(III.E.il.10) 



For large n (III. E.^. 10) is approximated by 












n 

I X^ , 

i=2 



(III.E.il.11) 



where is from (III.E.M.7) and a^g ( III. E. ^1. 8) . 

Simulations of the least squares estimator of Y were conducted 
for selected values of Y in SIMTBED using Type III plans. The results 
are summarized in Tables III.E.^.1, III.E.M.2 and III.E.^.3. The 
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results reflect the theoretical behavior of the estimator as derived 
above . 

We note that the joint conditional least squares estimators of p 

and Y in the BELAR(l) process are the same as in the linear AR( 1 ) 

n 

processes. Minimizing the sum 'I R? where now 

i=2 ^ 

R^ = (X.-p) - Y(X.^^-p), (III. E. 4. 12) 

leads to the following joint estimators for p and Y 



n .s n 

y = ( I X - Y I X ) / (n-1)(1-Y), (III. E. 4.13) 

i=2 i=2 



Y 



I (X -p)(X -p) 

i=2 



/ I (X -p)^ 

i=2 



(III. E. 4. 14) 



For large n these equations reduce to the familiar ones 



p = X (III. E. 4. 15) 

n _ _ n _ 

Y = I (X -X)(X -X) / I a -X)K (III.E.4.16) 

i=2 i=2 

We now turn in the next section to the question of alternative 
estimators for Y given that p = 0 and X = 1 . 
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5. Other Estimators of the Lag-1 Serial Correlation 



a. Estimators Based on a Non-linear Residual 

In this section, we explore other possibilities for 
estimating Y in the BELAR(I) process. There is a question as to why one 
should use the linear residual since the BELAR(1) process is a random 
coefficient process which is non-linear. Secondly, why should you 
minimize the square of the linear residual as opposed to minimizing some 
other symmetric loss function which is a function of the linear 
residual? The answer to both questions is that the least squares 
estimator of Y based on the linear residual out- per f or med other 
estimators in the simulation experiment. 

Consider the following types of non-linear residuals 

* 2 

R = X -YX -6(X -2), (III. E. 5.1) 

n n n-1 n 

R* = X -YX ,-6X^ ,Sign(X J. (III. E. 5. 2) 

n n n-1 n-1 ^ n-1 

From ( III. E. 5.1), it follows that R has zero mean and 

n 

Var(R*) = 2(1-Y^+10e^) , (III. E. 5. 3) 

Cov(R*,R* J = 20aB^. (III. E. 5. 'I) 

n n- \ 

* 

Introducing the extra parameter, 6, makes the residuals, R^, correlated 
unless a = 0 or 6 = 0 . If B is zero, then we again have the usual 
linearized residual in (III. E. i<) . If B* = Y^/10, then the variance is 
a constant, but the residuals are still correlated. It is easy to 
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compute the least squares estimators for Y and S from (III. E. 5.1) and 
(III. E. 5. 2). We simulated the estimators of Y and B and compared them 
to the results based on (III. E. 4. 4) with B = 0. From Table III. E. 5.1, 
we see that the different estimators of Y from all three residuals are 
close to the true Y. The result is that the estimates of B are very 
close to zero. 

To see how much the value of Y could change with B fixed at 
some non-zero values, we simulated the least squares estimator of Y with 

B = 0 and the estimator of Y based on (III. E. 5.1) with B = Y/ 1 0 and 

again with B = -Y// 10. From Table III. E. 5. 2, we see that B ^ 0 

severely alters the estimate of the serial correlation. Therefore, in 

the remainder of this subsection, we consider alternative estimators for 

Y in the BELAR(I) process to be only those based on the linear residual. 

b. Estimators Based on the Linear Residual, R 

’ n 

Besides the asymptotically unbiased least squares estimator, 
we considered the following well-known estimators of Y in linear AR(1) 
models: 

1) The Huber(c) function as described by Denby and Martin [Ref. 38]. 

The estimator, Y , is the value of Y that satisfies the 

rl 

equation 



n 

I X, .'l'„(x,-Yx, . ) = 0, (III. E. 5. 5) 

^ 1“ I n 1 1“ I 
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TABLE III. 3. 5.1 



Simulation Results for Various Definitions of R 

n 



in BELAR(l) 



1 . 


N = 500 




a = .5 


Y = 


Corr(X ,X ,) 
n n-1 


= ,63662 


DATA 


A 

^LS 


6 = 0 


Y 


6 


Y' 


6' 


XI 


.56891 


0 


.57192 


.00279 


.62082 


-.01771 


X2 


.61996 


0 


.61 630 


-.0081 5 


.56051* 


+.01 637 


X3 


.62651 


0 


.62601* 


.00358 


.78189 


-.05808 


XI* 


.57995 


0 


.58371* 


-.01865 


.75716 


-.07208 


X5 


.59236 


0 


.59233 


-.021 00 


.70995 


-.01*71*8 


AVG 


.59751* 




.59807 


-.00829 


.68607 


-.03580 


STD 


.021*99 




.02257 


.01 151* 


.09330 


.03535 


BIAS 


-.03908 




-.03855 


-.00829 


+ .01*91*5 


-.03580 



2. 


N = 1000 




a = . 5 


Y = 


Corr(X ,X , ) = 
n ’ n-1 


.63662 


DATA 


A 

^LS 


6 = 0 


Y 


6 


Y' 


6' 


Y1 


.63026 


0 


.62955 


-.001*23 


.62985 


.0001 3 


Y2 


.671*22 


0 


.65653 


.02520 


.59178 


.03095 


Y3 


.62566 


0 


.62921 


-.00590 


.5961*6 


.01093 


Yl* 


.67738 


0 


.67777 


.00233 


.60522 


.02359 


Y5 


.61*661* 


0 


.61*781* 


-.00560 


.6281*1 


.00581 


AVG 


.65083 




.61*818 


.00236 


.61031* 


.01 428 


STD 


.021*1 1 




.02032 


.01 320 


.01782 


.01 273 


BIAS 


+ .01 1*21 




+ .01 156 


+.00236 


-.02628 


+.01 428 



3. 


N = 1500 




a = .75 


Y = 


Corr(X ,X J 
n n-1 


= .83463 


DATA 


yv 

^LS 


6 = 0 


Y 


6 


Y' 


yv 

6' 


Z1 


.81 183 


0 


.81 671 


.00797 


.86364 


-.01821 


Z2 


.80699 


0 


.80700 


-.00040 


.82072 


-.00511 


Z3 


.81777 


0 


.81795 


-.001 60 


.83399 


-.00641 


Z4 


.85279 


0 


.85569 


-.00728 


.891 1 6 


-.00193 


AVG 


.82235 




.82434 


-.00033 


.85238 


-.01 041 


STD 


.02077 




.021 47 


.00629 


.03147 


.00598 


BIAS 


-.01 229 




-.01029 


-.00033 


+ .01775 


-.01041 
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TABLE III. E. 5. 2 

Simulation Results for Various Definitions 

of R to Estimate Y Given 6 in BELAR(I) 
n 





N = 500; 


a = . 5 


Y = .63662 






DATA 


(^Ls|6 . 0) 


r"* 

[y 


B = ^ ) 
/To 


r"* 

[7 


e = ■ 


1 


.56891 




.27552 


.27410 


2 


.61996 




.21515 


.26257 


3 


.62695 




. 38621 


.38450 


4 


.57995 




.34356 


.39730 


5 


.59236 




.36082 


.40557 



where 



»„(t) 



t if |t I ^ c, 

c Sign (t ) if |t I > c . 



(III. E. 5. 6) 



The corresponding weight function w (t) is f„(t)/t and c is 

rl n 

a tuning constant. As c goes to infinity 'f„(t) approaches t and Y„ is 

n rl 

the least squares estimator of Y. If c = 0, we have the solution of 
(III. E. 5. 5) is the median of 

For c other than 0 or «, there is no closed-form solution to 
(III. E. 5. 5). We obtain the Huber(c) estimator of Y by iterating the 
following scheme: 



y x.x. , w„ 
i-2 ^ " 



x.-Y.x. . 
1 k 1-1 



k+1 



n fX.-Y.x. , 'i 



(III. E. 5. 7) 
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where is the least squares estimator of Y and 

median |X . I 

S = -JlL 

r .69315 



(III. E. 5. 8) 



is the scaling constant for the R, . If Y = 0, then S is the median 

i r 

absolute deviation estimator of the scale parameter in the Laplace 
distribution as given in Section III.E.3. Typical values of c are 1, 
1.5, 2. We use for illustration c = 1 in the simulation along with , 
the least squares estimate, and Y^^, the median (X^/X^_^). 

2) The Least Absolute Deviation (LAD) estimator of Y is the minimizer 
of 



n 



I I 

i=2 



x.-Yx 

1 



i-1 



(III. E. 5. 9) 



The solution is, , the weighted median of where 

the weights are x^_^ for i = 2,...,n. 

Denby and Martin [Ref. 38] reported that the Huber(c) 
estimates are consistent and asymptotically unbiased for linear AR(1) 
models. Bloomfield and Steiger [Ref. 39] showed that the LAD estimator 
is strongly consistent and asymptotically unbiased for linear AR( 1 ) 
models. In Figures III. E. 5.1 - III. E. 5. 4 are examples from SIMTBED of 
the behavior of these estimators in simulated data from LAR(1), a linear 
AR(1) model with Laplacian marginals and AR(1) correlation structure 
given in Chapter II. These results appear to be consistent with the 
results reported above for linear AR( 1 ) processes. The leading 
coefficient in the expansion for the mean of each estimator does not 
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III. E. 5. 2. SIMTBED Boxplot Analysis of Weighted Median Estimator of y 

with Y=* 63662 in the LAR(l) Process 
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.63662 in the LAR(l) Process 



differ significantly from the true value, 0.63662. We also see that the 
median (X^/X^_^ ) and the weighted median (X^/X^_^ ) estimators are 
considerably more efficient than either the Huber (c) estimator in Figure 
III. E. 5. 3 or the least squares estimator (c = ») in Figure III.E.5.^. 

Since the least squares estimator remains asymptotically 
unbiased for the BELAR(l) process as was shown in Section III.E.4, it 
was of interest to observe how the Huber (c) estimators, c < ®, and the 
LAD estimator of Y would behave. Considering the ordering suggested by 
the simulation results in the LAR(l) process, it would seem possible 
that the Huber (c) estimates could be better than the least squares 
estimator of Y. In the boxplot analyses in Figures III. E. 5. 5 - 
III. E. 5.8 are the results of the simulation for Y = .63662, but for 
data from the BELAR(l) process. The boxplots in Figure III. E. 5. 5 
display the theoretical behavior of the least squares estimator of Y. 
The other estimators of Y appear to be converging to other values 
Yq * Y. To see this, note the first entry in the coefficients for the 
asymptotic expansion of the mean of Y in Figures III. E. 5.6 - III. E. 5. 8. 
In each case Yq > Y. Also from the estimate of the standard deviation, 
we assert that Yq is significantly larger than Y for each of the 
alternative estimators investigated here, because the difference, 
|y - YqI , is larger than four standard deviations. 

For the BELAR( 1 ) process, we observe a reversal from the 
LAR(l) process in preference for the estimator of Y. We will use the 
least squares estimator as the initial estimator of Y in the iterative 
procedure for finding the maximum likelihood estimator of Y which we 
develop next. 
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SAMPLE SIZE (N):20000 NO. OF REPLICATIONS 10 
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.63662 in the BELAR(l) 



SAMPLE SIZE (N):20000 NO. OF REPLICATIONS (M): 10 DEGREE OF REGRESSI ON_( 0) 
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III. E. 5 , 7 . SIMTBED Boxplot Analysis of Median (X^/X^_j^) Estimator of y with 
and Y =» 63662 in the BELAR(l) Process 
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6. Maximum Likelihood Estimation of Y 



a. Introduction 

In this section, we develop the maximum likelihood estimator 

of the lag-1 serial correlation in the BELAR( 1 ) process, Y We use 

the expression for the logarithm of the likelihood function, L(a) , in 

(III. D. 2. 12) in an iterative procedure to find the values of a and the 
1 /2 

sign of (a,1-a), that minimizes -L(a); call the pair sign). 

1 /2 

Since knowing a and the sign of A (a,1-a) uniquely defines Y, Y^., _ can 

n rlLb 

be found from (III. E. 4. 8) using (a,„_, sign). 

MLE 

We consider only the univariate problem. That is, we have 
assumed that is marginally Laplace distributed or have determined 

from Q-Q plots that the best Jl-Laplace fit to the data is when X, = 1 . 
Secondly, we assumed that is standard Laplace (p = 0; X = 1) or 

that {X^} has been standardized using a pair of estimators (p, \) from 
Sections III.E.2. and III.E.3. 

As a function of a, (III. D. 2. 12) is very complicated. There 
is little hope of being able to analytically solve for the critical 
values of a. In fact, the evaluation of a derivative of (III.D.2.12) is 
at least as expensive computationally as the function values themselves, 
since (III.D.2.12) contains exponential functions of a. However, since 
this is a one-dimensional optimization problem, there are IMSL routines 
that will perform the search without using deviati ves--Golden Section 
search; bisection method; or interpolation routines. 

We chose the IMSL routine ZXLSF which performs a one- 
dimensional search for a minimum of a smooth function in a closed 
interval using quadratic interpolation. The FORTRAN routine which 
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evaluates ( III. D. 2. 12) is formulated so that ZXLSF is searching on the 
interval (-1,1) where a < 0 implies that conditional densities of the 
form (III. D. 2. 10) are being evaluated instead of those given by 
(III. D. 2. 9) when a > 0. The initial value for a to start the iteration 
procedure-of ZXLSF is a four-digit approximation (S,„, sign,„) 
corresponding to the least squares estimate of serial correlation, 
obtained from (III. E. 4. 8). 

The queston of accuracy in the calculation of (III. D. 2. 12) 
is especially important because the likelihood surface is extremely flat 
in many cases. We want some assurance that ZXLSF is efficiently 
searching for the optimum and not "chasing roundoff errors". This 
happened before we increased the accuracy parameter in DCADRE and used 
double precision. In order to assess the accuracy of our calculations, 
we constructed first- and second-divided differences for values of a and 
( 1 1 1 . D . 2 . 1 2) . The divided differences are approximations for the 
derivatives. For those simulations that we checked, there was one 
transition of the slope through zero at the critical point found by 
ZXSLF. The second-divided differences at all points in the vicinity of 
the critical value were positive indicating the general convex upward 
shape of ( III.D. 2. 1 2) . Sometimes there was some fluctuation in values 
of the second-divided differences, but no change of signs near the 
reported optimum. 

The fluctuating values of the second-divided difference 
indicated some noise in the calculations. This occurred in two places. 
If the s eco nd- di vi ded difference covered points on both sides of 
a = 1/2, then there was often a jump in the value of the second-divided 
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difference. This occurred because of the change in the method of 
calculating the conditional density when a changed from a < .5 to 
a S .5. Other times, slight aberrations in the observed pattern of the 
second-divided differences occurred for values of a that were small, 
0 < a < .15. This is attributed to the fact that DCADRE evaluations for 
the table of values of the ( 1-a)-Laplace density (0 < a < .15) in many 
subintervals was not behaving regularly. The computed value was 
accepted because the estimated error was small, relative to the accuracy 
requirements. The important consideration, however, was that no error 
in calculating (III. D. 2. 12) should be so large as to falsely indicate a 
change in convexity in the vicinity of an extremum, so that ZXLSF would 
be ineffective at locating it. 

The selection of a good starting point in this procedure is 
also important. It is desirable to commence the iteration in ZXLSF as 
close to the global optimum as possible in order to reduce the 
possibility of converging to a local optimium. Note, also, that as a 
function of a, the conditional density is not necessarily convex and 
often is not even unimodal across the range from Y = +1 to Y = -1 . 

Since (III.D.2.12) is the logarithm of the product of such 
functions, there is no assurance that (III.D.2.12) has a single relative 
maximum especially for small sample sizes. When the sample size is 
small, it is advisable to pick a starting value for the iteration on 
both sides of a = 0. Select the maximum likelihood estimator to be the 
one with the higher value of L(a) if the routine produces two different 
a’s, corresponding to the pairs (a^,+) and (a^."). 
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Since we know that is a consistent, asymptotically 

unbiased and asymptotically Normally distributed estimator for Y, we 

A 

chose the value of a and model corresponding to Y^^g as our initial guess 
in ZXLSF. 



b. Simulation Results 

The maximum likelihood routine for estimating Y was tested 
in simulations using computer generated data from the BELAR( 1 ) process 
with known parameter values of %, p, X and a. By performing M 
independent simulations of sample size N (where N is increased for each 
set of M simulations) and fixed a, we were able to compare the standard 
deviation and bias (if any), of Y.,, „ to that of the initial least 
squares estimator Y. for which the asymptotic distribution is Normal. 

Lio 

Changes in the Normal plots for one set of M simulations for N small to 
a second set of M simulations for a larger N would give some indication 
of how fast is or is not converging to a Normal distribution. 

Both M and N were small in the simulations for two reasons. 

A 

Since the asymptotic distribution of Y, „ was known, it was of more 

Li o 

A 

interest to see how much better Y.,, „ was for the smaller samples (i.e., 

MLE 

was the bias smaller for or was it, in fact, unbiased). Secondly, 

the run times for calculating (III. D. 2. 12) for N < 200 was long. The 
evaluation per sample of size N = 25 ranged from 100-300 secs. For 
N = 175, the run times ranged fran 700-950 secs. 

Figures III. E. 6.1, III. E. 6. 2 and III. E. 6. 3 are the Normal 
plots of twenty realizations of the maximum likelihood estimator of 
serial correlation and the least squares estimator of serial correlation 
for simulated data from the BELAR(I) process for selected values of a 
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Figure III.E.6.1. Normal Probability Plots of the Maximum Likelihood and the 

Least Squares Estimators of y BELAR(l) Process for 

20 Samples of Sizes 25 and 125 with a=.ll and y=. 19216 
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Figure III. E, 6. 3. Normal Probability Plots of the Maximum Likelihood and the 

Least Squares Estimators of y the BELAR(l) Process for 
20 Samples of Sizes 10 and 250 with a=.844 and y=-. 89986 



and for two subsample sizes, SSN. The layout provides for two-way 

comparisons. That is, from smaller SSN can be compared to for 

larger SSN. Likewise, for a given SSN, can be compared to 

which is known to have an asymptotic Normal distribution. The straight 

line in the Normal plots corresponds to a Normal distribution. The 

curved lines correspond to the Kolmogorov-Smirnof f bounds calculated 

from the data at the 95?t confidence level by the routine in the IBM 

experimental APL routine called GRAFSTAT. 

It appears from these figures that for the larger values of 

SSN, Y... c. Y, ^ fit Normal distributions better than the corresponding 
MLh Lo 

samples from the smaller values of SSN. It also appears that Y,., „ fits 

MLE 

a Normal distribution as well as the Y, „ for the larger values of SSN. 

LiO 

This supports the notion that the maximum likelihood estimator is 
converging to a Normal distribution. 

Figures 1 1 1 . E . 6 . , III. E. 6. 5 and III. E. 6. 6 are the 

corresponding scatter plot analyses for the data in the previous figures 

A ^ 

for the larger value of SSN. It appears that Y,,, „ and Y, _ have a 

ML b Lo 

positive correlation coefficient which becomes more pronounced as the 

data becomes less correlated. The distribution of Y„, „ also appears to 

MLE 

have a smaller variance than Y,_. This effect is more pronounced for 

Lo 

more highly correlated data. Compare, for example, the estimated 

standard deviation of Y,„ „ and that of Y, „ from the table in Figure 

MLE LS 

III.E.6.M with the corresponding entries in the table from Figure 
III.E.6.6. 
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SELECTION :ALL 

X LABEL rMAXIMUM LIKELIHOOD ESTIMATOR 
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Figure III. E. 6. 4. Scatter Plot Analysis of the Maximum Likelihood and the Least Squares Estimators 

of Y ip the BELAR(l) Process for 20 Samples of Size 125 with a=.ll and 1^216 
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Figure III. E. 6. 5. Scatter Plot Analysis of the Maximum Likelihood and the Least Squares Estimators 

of Y in the BELAR(l) Process for 20 Samples of Size 175 with a=.5 and y=-63662 
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Figure III. E. 6. 6. Scatter Plot Analysis of the Maximum Likelihood and the Least Squares Estimators 

of Y the BELAR(l) Process for 20 Samples of Size 250 with a=.844 and y=~*89986 



F. H-LAPLACE MOVING AVERAGE MODELS 



1 . Introduction 

In this section, we derive a time series model that has an 

H-Laplace marginal distribution and the correlation structure of a 
t h 

linear q -order moving average model. This construction uses the 
square root Beta-Laplace transform given in Section III.B. 3 . The first- 
order model retains the full range of correlations up to 1/2. 

2. The First-Order Moving Average Model 

Let {L (il-a)} be an i.i.d. sequence of ( H-a) -Laplace random 
n 

1 /2 

variables; { A^ (a,8,-2a)} be an i.i.d. sequence, independent of 
{L^(J,-a)}, where A^ is a Beta (a,i-2a) random variable and 0 < a < %/2. 
Both the innovation and the coefficient sequences are independent of 
^n-1 ’ ^n-2 ’ *** * sequence {X^()U given by 

X (H) = L (Jl-a) + A^^^(a,il-2a)L ,U-a), (III. F. 2.1) 

n n n n-1 

has a marginal J, -Laplace distribution and an MA(1) correlation structure 

such that 0 < Corr(X ,X ,) < 1/2. 

n ’ n-1 

To see that has an 2,-Laplace distribution, first note that 

by the square root Beta-Laplace transform theorem of Section III.B.3, 

1 /2 

the distribution of the product A (a,il-2a)L Ai-a) is a-Laplace. 
Then note that is the sum of two independent random variables, one 

of which has an ( J, -a) -Laplace distribution and the other has an a- 
Laplace distribution. So, if (JiyCo)) is the characteristic function of 

A 

X^(H), then 
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4i^(w) = 



(III. F. 2. 2) 




i-a 





To see that {X (i)} has the correct correlation structure, first 
n 

note that by the construction of (III. F. 2.1), X^_j^ is explicitly 
independent of for |k| S 2. Therefore, Corr (X^ ,X^_j^ ) is zero for 
|k| > 2. 

For k = ±1 , we have, after some simplification 

af(a+-^) r( 2.+1 -a) 

Corr(X ,X ) = T- . (III. F. 2. 3) 

" «,r(a+1)r(Jl-a+^) 

Finally, note that in the limit as a ^ 0, (III. F. 2. 3) approaches 

zero. Also, as a ^ 1/2, (III. F. 2. 3) approaches 1/2. 

Replace A^ ^^( a, Z,-2a) in (i|.1) by -aJ^^^(o, i-2a) , we have a full 

range (-1/2,0) of nonpositive lag-1 serial correlations. 

3. The q-Order Moving Average Model 

The MA(q) model for q ^ 2 is the extension of the MA(1) model 

given in (III.F.2.1). Let {L^(il-qa)} be an i.i.d. sequence of (2,-qa)- 

1 /2 

Laplace random variables. Let [A . {a, t-(q+1 )a} ] for i = 1,...,q be 

n , 1 

i.i.d. sequences, independent of each other and of {L^(2,-qa)}, where 

A . is a Beta { a , 2, - ( q + 1 ) a } random variable for all n and all 
n , 1 

i = 1,...,q. Also, 0 < a < )l/(q + 1 ) . Both the innovation and each of 

the coefficient sequences are independent of Then the 

sequence {X^(Z)} given by 
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(III.F.3. 1 ) 



Q 1 /p 

X (H) = L (H-qa) + I Ha, Z-(q+1 )a}L . (H-qa) , 
n n n-i 

has a marginal l-Laplace distribution and an MA(q) correlation structure 

for 0 < a < t/(q + 1). When a = 0, then is an i.i.d. sequence; 

when a = Jl/(q+1), then the moving average is an equal weighted average 

of q+1 i.i.d. a-Laplace error terms L^(a). 

To see that is an t-Laplace random variable, first note 

from the square root Beta-Laplace transformation theorem of Section 

1 /2 

III.B.3, that each product A .{a, Jl-(q+1)a}L _.(J,-qa) has an a-Laplace 

n f X ri“ X 

distribution . 

But the sum of q i.i.d. a-Laplace random variables has a qa- 
Laplace distribution. Thus, X^(Sl) is the sum of two independent random 
variables and its characteristic function is obtained as the product 



w) 




I 




i=1 




a 



_J It-qa f_1 Iqa _ f_J ]?, 

1+0)=") (l+io='J ■ * 



(III.F.3. 2) 



The correlations are truncated at lags |k| i q + 1 • By the 

construction of (III. F. 3.1). X is explicitly independent of X , for 

n ^ •' n-k 

|k| > q+1. 

Negative correlations are obtainable with 2^ choices for 

1/2 1/2 

replacing or not replacing [A . { a, !l-(q + 1 ) a} ] by [-A . la, 2.=(q+1 )a} ] in 

n I X n I X 

(III.F.3. 1) . 
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This model can be generalized from the one-parameter case by 



replacing qa in (III. F. 3*1) with a. in each term in the sum, 

q 

replacing L ( 2,-qa) by L (i-q E a.). 

i = 1 



and 
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IV. RESIDUAL ANALYSIS COMPARISON OF THE NLAR(1) 

AND THE BELAR(I) PROCESSES 

A. INTRODUCTION 

Lawrance and Lewis [Ref. 22] developed a higher-order residual 

analysis for non-linear time series with autoregressive correlation 

structures. Specifically, they developed a third-order analysis based 

on the cross-correlation of the linear residual, R^, and its square at 

lag k, They applied the analysis to the problem of modelling wind 

speed data. It is important to note that this analysis was done in 

conjunction with, and not in place of, the usual second-order analysis. 

As has been already pointed out, second-order analysis is sufficient for 

modelling only when the processes are both linear and Normal. 

The residual analysis involves only joint moments of order three. 

In Chapter II of this thesis, it was shown that for the NLAR(p) models 

with p = 1,2, all the third-order moments-- that is, those of the form 

E(X.X.X, ) for all i, j, k — are zero. Therefore, the Lawrance and Lewis 

residual analysis will not differentiate between the NLAR(p) processes 

with the same autocorrelation structure. It can also be shown by 

induction on k that Corr(R ,R^ ,) = Corr(X^,R , ) = 0 in the BELAR(l) 

n’ n-k n’ n-k 

process. Hence, either third-order residual analysis will be unable to 
discriminate the BELAR(l) process from any of the NLAR(l) processes with 
the same autocorrelation function. 

In the spirit of looking at the lowest possible joint moments for 
differentiating between models with symmetric marginals, a fourth-order 
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analysis is proposed. Two candidates are investigated as the basis of 
this analysis. The first one considered is the cross-correlation of 

n 

and the linear residual at lag k, R , . The second is the 
autocorrelation of R^ and The two analyses are compared by their 

abilities to differentiate among the different types of NLAR(1) 
processes and the BELAR( 1 ) process. 

Table IV. A. 1. contains a summary of the models in the comparison, 

along with the selected sets of parameter values and corresponding 

correlation coefficient. Even though each of the models has the same 

marginal distribution (standard Laplace) and identical autocorrelation 

functions, each has a theoretical cross-correlation function in terms of 

(X*,R , ) and autocorrelation function for (R^,R^ , ) that are different, 

n n-k n n-k 

The question of how the residual analysis is affected by parameter 
estimation is an important issue, but is not addressed at this time. 

Before the candidates are developed in the next two sections, it is 
convenient now to place both the NLAR(1) and BELAR( 1 ) processes into 
their common RCA(1) framework. 

Using the terminology of Nicholls and Quinn [Ref. 16], both the 
NLAR( 1 ) and the BELAR(I) processes can be written as 



X =cX +BX +e, 
n n-1 n n-1 n 



(IV.A.1) 



where {e } is the i.i.d. innovation, E(e ) = 0, and otherwise defined as 
n n 
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1. ( 1 -a) -Laplace in the BELAR( 1 ) process; 

2. standard Laplace, but with a degenerate component at the origin in 
the LAR(1) process; 

3. scaled Laplace where A = / 1 -a^ in the TLAR(1 ) process; 

4. convex mixture of scaled Laplace variables in the general non- 
boundary NLARd ) process. 



TABLE IV. A. 1 

Summary of Models with Laplace Marginals and Autocorrelations of Y 



Model 




Parameter Values 


Y 


Comments 






= 1 ; = .19216 


.19216 


Linear models; 


LAR(I) 




= 1 ; = -.63662 


- .63662 


one boundary of 




“l 


= 1 ; = .89986 


.89986 


NLARd ) family. 






= = .43836 


.19216 


General discrete 


NLAR( 1 ) 


“l 


= .797885; B^ = 


-.63662 


random coefficient 






= B^ = .94861 


.89986 


model . 




a 


= .11; positive model 


. 1921 6 


General continuous 


BELAR(I) 


a 


= .50; negative model 


-.63662 


random coefficient 




a 


= .884; positive model 


.89986 


model . 






= .19216; 6^ = 1 


.19216 


Other boundary 


TLAR(I) 




= .63662; B^ = -1 


-.63662 


model of NLARd ) . 






= .89986; B^ = 1 


.89986 




The 




process is the i.i.d. 


random 


coefficient process 



independent of { e } and {X, } for k ^ n-1 with E(B ) = 0 and otherwise 

n k n 



defined as: 
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1. ±( ( a , 1 -a) - Y), where Y = E{ A^^^(a, 1-a) 1 and A^(a,l-a) is a 

standard Beta random variable in the BELAR( 1 ) process; 

2. 0 in the LAR(1) process, since it is a linear, constant 
coefficient AR(1) process; 

3. {K^-a} in the other NLAR(1) processes, where is a Bernoulli 

random variable such that P(K' = 1) = a, and 0 ^ I 6, I and a. 

and 6^ are not both unity. At = ± 1 the process is the TLAR(l) 
process . 

The c is a constant defined as: 

1. Y = E{ A^'^^Ca, 1-a) } in the BELAR( 1 ) process; 

2. = B^E(K^) in all the NLAR(1) processes (a, = 1 being the 
LAR(1) process). 

The second and fourth moments of E and the second, third and fourth 

n 

moments of B are needed in the following sections. In Table IV. A. 2, 
n “ 

there is a convenient summary of the necessary equations. 

Now the linear residual, written in terms from (IV. A. 1) has the 

following forms analogous to (III.E.M.3) and ( III. E. <4. 4) , 

R = B X + e , (IV. A. 2) 

n n n-1 n’ 

R^ = - cX^ .. (IV. A. 3) 

n n n-1 

Using (IV. A. 2) and the independence of {B^} and the second and 

fourth moments of R are 

n 
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Various Moments for B and e in the RCA(1) Models 
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24(1-B?) 24[1-a,6?n+(l-a,)6?}J 12(l-a)(2-a) 24(1-, 



( IV. A. H) 



= 2E(B^) + ECe^") , 
n n n 

E(R“) = 24E(B-) + 12E(B^)E(e^) + E(e“). (IV. A. 5) 

n n n n n 

The variance of R* when needed is derived from {VJ.A.H) and IV. A. 5). 
n 

B. RESIDUAL ANALYSIS USING Corr(X' ,R , ) 

n n-k 

In this section, the residual analysis using the theoretical cross- 
correlations, Corr(X^, is developed. Using (IV.A.l) and (IV. A. 2), 

we have 

K = 1 ^ 3c^X^ .R^ + 3cX^ ,R^ + R^, (IV.B.l) 

n n-1 n-1 n n-1 n n 

where c is defined in Section IV. A. 

The cross-correlation function of X^ and R , at lag k is defined as 

n n-k 

E(X"R ) 

C (k) = Corr(X^,R ) = ^ > (IV. B. 2) 

31 n n-k ®x^°R 

n n-k 

where Var(X^) = E(X^) = 6! and Var(R , ) is given by (IV. A. for all n 

and all k, since as shown in Section III.E.3, is stationary 

whenever {X } is. 
n 

Now from the construction of R^ in (IV. A. 2), it is explicity clear 

that X and R , are dependent for all k and that the {R } are not 
n n-k ^ n 

independent of each other, unless B^ is identically zero as in linear 
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constant coefficient AR(1) processes. However, by the Residual Theorem 
(Lawrance and Lewis [Ref. 22]), the {R^} are uncorrelated. 

From (IV.B.1), we have for all k 



C,, (k) = {c^E(X^ R ) + 3c^E(X^ R R ) + 3cE(X R"R ) 
31 n-1 n-k n-i n n-k n-1 n n-k 



+ E(R^R )} / [12/ 5 {E(R^)} 
n n-k n 



1 / 2 - 



(IV. B. 3) 



Consider (IV. B. 3) for k < 0. Since the random coefficients {B } 

n 

are independent of the process {X.} for j ^ n-1, this implies that 

d 

C 2 ^(k) is zero for k < 0. For k = 0 in (IV.B.3), we have, after some 
simplification. 



C 



31 



(0) 



72c^E(B* ) + 6c^E(e^ ) + 72cE(B*) + E(R“) 
n n n n 

12/~ {E(R")}^''^ 
n 



(IV.B.lO 



For k S 1, there is the following recursive formula. 



C^^ik) = C2^(K-1){c^ + 3cE(B^ ) + E(B^)} + 



c (1 -c^)E(£^) 
n 

1/2 



. (IV. B. 5) 



2/ 5 {E(R2)} 



It is now a simple matter to consolidate the expressions for C 2 ^(k) 
for all k and substitute the appropriate expressions from Table IV. A. 2. 
For the NLAR(1) models — including LAR( 1 ) , for which = 1 , and TLAR(1) 
for which = ±1 — we have 
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k < 0; 



C3,(k) - 



0 . 

{2 - + 6a^ 1-2a^ ) (1-a^ ) - a*B|'(8-l9a^ +1 2ap } 



a^B;C3,(k-l) . 



10 






. k = 0; 



(IV. B. 6) 



k > 1 



10 



For the BELAR(I) process, we have 



C 



31 



(k) 



0. 

(6 - 5Y^ - aYd 

3 /To 



k < 0; 

k = 0; (IV. B. 7) 



j( 1+20)03^ (k-1) 



Y^^d-g) d-Yd^""^ 

/To 



k > 1 



The theoretical cross correlation functions for each of the models 
and sets of parameters in Table IV. A. 1 are given in Figures IV.B.1 - 
IV. B. 3. Three points can be made. For the models with |p| small, such 
as in Figure IV.B.1, there is little difference between the cross- 
correlation functions of all four models. (Of course for p = 0, there 
is absolutely no difference, since all NLAR(1) models and the BELAR(I) 
model collapse into the unique i.i.d. case). A difference between the 
cross-correlation function of the boundary NLAR(1) models — LAR(1) and 
TLAR(1) — does become more apparent as |p| increases. But, there seems 
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THEORETICAL CROSS-CORRELATION OF AND R 

LAR(1): a, = 1 B,=p p=.19216 NUVR(1); a,=B,=p-* p=.1 
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Figure IV.B.l. Theoretical Cross-Correlation Functions of and 
RCA{1) Processes with p(l)=. 19216 
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THEORETICAL CROSS-CORRELATION OF AND R 
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Figure IV. B. 3. Theoretical Cross-Correlation Functions of and 
RCA(l) Processes with p(l)=. 89986 



to be little distinction between the cross-correlation functions of X* 

n 

and from the BELAR( 1 ) process and the non-boundary NLAR( 1 ) process 

with = /fpT even when |p| is large as in Figure IV. B. 3. This 

final point suggests the possibility that there exists a pair of values, 
(a^,6^), whose product is p 0, for which the BELAR(I) process and the 
corresponding NLAR(1) process will not only have identical 
autocorrelation functions, but may also have nearly identical cross- 
correlations of and R , for some specified number lags 

k » 0, 1 , . . . , j . 

The cr OSS - cor r el at i ons of X^ and R , can also be used to 

n n-k 

distinguish the random coefficient AR(1) processes with a standard 
Laplace marginal distribution from the Gaussian AR(1) process where 
X^ - N(0,2) and - N { 0 , 2 ( 1 - a ^ ) } , where a is the correlation 

coefficient. We have for the Gaussian AR(1) models. 



C3,(k) . Corr(X^.R„,^) - 



0 k < -1, 

(3/5(1-a^)^^^ k = 0, 

a'^C^^ (0) k > 1 . 



( IV. B. 8) 



Note that is C^^(k) for k ^ 1 is proportional to Corr (X^ ) = a . 

This is consistent with the fact that a Gaussian process is completely 
determined by the mean and covariance matrix. 

Figures IV. B. 4 - IV. B. 6 show the theoretical cross-correlation 
function of the Gaussian AR(1) model superimposed on the values for the 
different models from Figures IV.B.1 - IV. B. 3, which have the standard 
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RESIDUAL ANALYSIS COMPARISIONS USING CORR(X^^,R^_,^ 

p = .19216 




M OV1 IV N0li\n3dd00-SS0d0 
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Figure IV. B. 4. Residual Analysis Comparisions Using Corr (X^ , for the 
Gaussian AR(1) Process and the 4 RCA(l) Processes with 
Var(X )=2, E(X )=0, and p(l)=. 19216 



RESIDUAL ANALYSIS COMPARISIONS USING CORR(X^^,R^_,, 

p = —.63662 
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RESIDUAL ANALYSIS COMPARISIONS USING CORR(X^^.R^_,, 

p = .89986 




4 -» 

U 

0 



+j 



o 

5 



cn c 

X 

u 

u 

8 

cr» 

. a 

•H 

W 

Ifi 

C 

o 

•H 

W 

•H 

U 

03 

O 

U 

03 
•H 
0) 
>1 
I— ! 

rC 



03 

D 

TJ 

•H 

03 



< 



<D 

4-> 

n 

C 

03 

03 

03 

03 

O 

O 

Ch 



0^ 

< 

03 

•H 

03 

03 

D 

03 

O 



CQ 

> 

H 

(D 

U 

d 

cn 

•H 

pL| 



'X) 

00 

0*^ 

on 

00 



'O 

c: 

03 



o 

II 



X 

w 



CN 

II 



X 

u 

03 

> 



234 



Laplace marginal distribution. There is some differentiation between 



the Laplace models with AR( 1 ) correlation structure and the given 

Gaussian AR(1) model, but not much. It would, however, be very easy to 

identify the Gaussian model from the Laplace models using probability 

plots. This illustrates the point made at the beginning of this 

chapter, that a higher-order residual analysis is not intended to 

replace the existing methods of analysis. It also emphasizes one of the 

very foundations of the thesis, that the marginal distribution is one of 

the very first aspects of a time series that should be examined. 

C. RESIDUAL ANALYSIS USING Corr(R^,R^ , ) 

n n-k 

In this section, the residual analysis using the theoretical 
autocorrelations, Corr (R^ , R^_l^) is developed. 

Let C„„(k) represent Corr(R^,R^ ,) for all k. Since the correlation 
function is an even function and {R^} is stationary, C^^ik) = . 

We represent only k = 0,1,2,... . Using (IV. A. 2), we have after some 
simplification for k ^ 1, 




n-k 
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(IV.C.1) 



= E(B^) Cov(X 
n 




) / Var(R^) . 



Now an immediate advantage to the analysis based on (IV.C.1) as 
opposed to that based on Corr(X®, apparent. For the constant 

coefficient models, LAR( 1 ) , will have a spike at lag-0 and be 

zero for all other lags, since B =0. This will not be the case for 
the other NLAR(1) random coefficient processes or in the BELAR(I) 
process. It will not, however, distinguish the LAR(1 ) process from any 
linear AR( 1 ) process. This, however, can be achieved using probability 
plots as mentioned previously. 

To derive a computational formula from (IV.C.1.) in terms of the 

parameters of the process, first let E^^(k) = E(X^R^ ,). Then, 

n n~k 

substituting in (IV. A. 1) and (IV. A. 2), we have, after some 
simplification for k = 0, 



Again using the stationarity of ( ) ar^d we have the following 

expression for the autocorrelation function 




= E(e‘*) + 2cE(e^) + 12E(e^)E(B^) +24 c^E(B 2) 
n n n n n 

+ 48cE(B^) + 2ilE(B‘*) . 



n n 



(IV.C.2) 



For k ^ 1, we have the recursion 




( IV.C.3) 
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k = 0; 



C^2^k) = 



1 , 

E(B") 

MariR^) 

n 



2E(R^)}, 



k > 1 



iiv.c.n) 



For the non-LAR(l) cases of the NLAR(1) process, we substitute from 
Table IV. A. 2 and equations (IV.A.^I) and (IV. A. 5) to obtain 



E22(k) 



4{6 - (5+1 26^-1 1a^ Bp } , 

“1^1^22^*^”^^ + ^( 1-app ( 1-app , 



k = 0; 

k > 1 . (IV. C. 5) 



0^2 (k) 



.1 , k = 0; 

.ap 1-ap BpE22(k-1 ) - 4(1-app} ^ ^ ^ 

4{5-app4 + 24 b^ - ^12apJ + 19app 



(IV.C.6) 



The corresponding results for the BELAR(I) process are 



E22(k) 



1 2(2+aY^-3Y^) , 
aE 22 (k- 1 ) + i*(1-a)(1-y^) , 



k = 0; 

k > 1 . ( IV. C. 7) 



C22(k) 



1 . 

(a-y"){E22(k-1) - M1-Y")} 
l|(5-12Y^+26aY^-19T‘*) ’ 



k = 0; 
k > 1 . 



( IV. C .8) 



The theoretical autocorrelation functions for each of the models and 
sets of parameters in Table IV. A. 1 are given in Figures IV. C. 1-3. There 
appears to be more discrimination between the TLAR(1) model and the 
other random coefficient models with Corr(R*,R^ , ) than with 
Corr(xp ^n-k^’ when |p| is small, as seen in comparing Figures 
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THEORETICAL AUTOCORRELATION OF R^^ r 
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RCA(l) Processes with p(l)=. 19216 



THEORETICAL AUTOCORRELATION OF R^^ and r 
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THEORETICAL AUTOCORRELATION OF R/ AND 

NLAR(1): a,=Bi=p^ p = .89986 BELAR(I): o=.844 p=.89986 



I L- 

^0*0 

X 



1 ' 

WO 

X 






ZO'O 0 

0^ IV N0tl^3dd000inv 




o 

5 



X 0\n IV N0LLV13da000inv 



§ 



A» 

u 

o 

4-1 

I 

a: 

c 

CN^C 

4-1 

o 

w 

c: 

o 

•H 

+J 

u 

c 

o 

&4 

c 

o 

-p 

fd 

r-| 

a; 

p 

u 

o 

o 

o 



rd 

O 

•H 

■P 

(D 

P 

0 

(U 

X 

Eh 



U 

O 



m 

U 

> 



‘ ‘ ' L 

ZO’O 0 

9\n IV Nouvi3ayoooinv 



0) 

u 

D 

U» 

•H 

pL| 



240 



RCA(l) Processes with p(l)=. 89986 



IV.C.1 and IV.B.1. There is still little discrimination between the 



BELAR(l) model and the given non-boundary NLAR(1) model. However, the 

important point is that since the LAR(1) model is a linear AR( 1 ) model, 

Corr(R^,R^ , ) =0 for all values of p and for all k = ±1 , ±2 

n n-k 
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V. EXTENSIONS AND OPEN PROBLEMS 



A. INTRODUCTION 

During the discussion in the previous chapters, possible extensions 
and/or unresolved issues have been mentioned. At this point, we con- 
clude by summarizing some of the directions in which this research could 
be continued. There are still significant contributions to be made, 
particularly in parameter estimation, model development and 
applications . 

B. ESTIMATION 

There are several open questions and extensions in the area of 
parameter estimation and inference for this class of stochastic 
processes . 

First, there is the need to obtain theoretical results substantiat- 
ing the empirical results from the simulation of the maximum likelihood 
estimator (m.l.e.) of serial correlation in the TLAR(1) and the BELAR(I) 
processes. Several researchers have written on the subject of maximum 
likelihood estimation in dependent sequences. Much of this is assembled 
in the books by Basawa and Prakasa Rao [Ref. M2] and by Basawa and Scott 
[Ref. M3]. It is not known whether the conditions on the conditional 
densities are satisfied in the cases of these random coefficient AR(1) 
models to prove that the m.l.e. is consistent, asymptotically efficient 
or asymptotically Normal. Conditions for the existence of the m.l.e's 
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are generally extremely complicated and difficult to verify unless the 
log likelihood is absolutely continuous in the parameter space. 

A second problem to resolve is that of existence and uniqueness of 
the maximum likelihood estimators of (a^,6^) in the NLAR(1) process. In 
this case, the log likelihood is definitely not differentiable with 
respect to the parameter, 8^, nor is it clear that there is a unique 
maximum. It appears from contour plots of the log- 1 i kel ihood function 
over a grid of values in (a^,8^) coordinates that there is a unique 
local optimum within the region bounded byO<a^ < 1 and-1 < 6^ < 1 
for large enough samples of A non-linear optimization technique 

that uses only function values and not derivatives seems to be 
appropriate, since the log-likelihood function is not differentiable 
everywhere with respect to 8^. 

A third problem involves the l-Beta-Laplace AR( 1 ) model. Except for 
the case when I is assumed to be one (the BELAR(I) model), the 
likelihood function in (a,)l) has not been derived. This is primarily a 
numerical issue since neither the density of for non-integer values 
of I nor the conditional density of X given X , for any values of 
J, > 0 have closed- form expressions. 

A fourth issue in estimation is to extend the maximum likelihood 
approach to include the joint estimation of the scale parameter of the 
marginal distribution to that of the shape parameter and the serial 
correlation coefficient. There is no reason to assume that the marginal 
distribution should always be a standard Laplace or standard l-Laplace. 
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Finally, there is the issue of quantile estimation in the random 
coefficient models. Empirical results are given only for the BELAR(I) 
process for the distribution of the sample median. Theoretical results 
are related to mixing conditions. Based on a new mixing condition, 
which has been shown to be satisfied by linear AR(1) processes 
[Ref. 44], Gastwirth and Rubin derived the asymptotic Normal theory of 
quantile estimation for the linear LAR(1) process. The open question is 
whether the mixing condition of Gastwirth and Rubin is satisfied by any 
of the random coefficient models — NLAR( 1 ) , BELAR( 1 ) or Jl-Beta-Laplace 
AR(1) . 

C. MODEL DEVELOPMENT 

Advances in modelling can be made in developing scalar models with 
p-th order autocorrelation structure, as well as bivariate autoregres- 
sive models. 

An open question in the development of the NLARMA(p,q) family of 
models is the existence of the general class of models with p-th order 
autocorrelation structure — NLAR(p) for p S 3; specifically, it is to 
derive the distribution of the i.i.d. innovation sequence This is 

only known for the TLAR(p) subclass of a proposed NLAR(p) family. 

A similar question is open for p S 2 in the continuous random 
coefficient models with an 2,-Laplace marginal distribution. The actual 
structure of the model, as well as the distribution of the innovation is 
in question. 

There is also a need for multivariate time series in many fields of 
physical science. The NEAR(2) framework was used by Dewald and Lewis 
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[Ref. 24] to derive a bivariate exponential AR( 1 ) model. Such an 
extension is also possible with the NLAR(2) model. Just how one 
estimates the eight possible parameters in such a model is an open 
question . 

Related to the model development and parameter estimation is the 

need to identify particular models. Higher order residual analyses have 

been based on the linear residual R = X - a.X . - a^X Since the 

n n 1 n-1 2 n-2 

NLAR(2) model is only partially time reversible, it is possible that the 

reversed residual R = X - a.X - a„X , could be used in model 

n n 1 n+1 2 n+1 

identification as well. These were introduced by Lawrance and Lewis 

[Ref. 6, 45] but their use has not been explored in any context. 

There is also the question of the effect that estimating a^ and a^ 

from {X } will have on the sample autocorrelation of (R^,R^ , ) and the 

n n n-k 

cross-correlation of (X*,R , ) in the fourth-order residual analyses 

n n- k ■' 

proposed in Chapter IV. 

D. APPLICATIONS 

In Chapter I, several areas have been noted where the modelling is 
accomplished with heavy-tailed distributions, notably in voice and 
acoustics modelling, as well as in image coding. In these areas, the 
Laplace distribution and the symmetric Gamma distributions are widely 
used. There is the possibility that the 2, -Laplace for I < 1 could also 
be a useful alternative to the symmetric Gamma. One advantage of the 
2,-Laplace distribution, which is the difference of two i.i.d. Gamma(l!,,\) 
is the simplicity of the form of the characteristic function. 
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Another field in which the i-Laplace models could be useful is in 
the modelling of the directional components of wind speed. Models with 
skewed marginal distributions have been fitted to data and then 
transformed either to Normals (for example by Brown, Katz and Murphy 
[Ref. 46]), or to Exponentials by Lawrance and Lewis [Ref. 6]. In both 
of the cited papers, the data indicated that the wind was almost always 
blowing. The question is, however, how does one model wind velocity 
when there are long calm periods. This is a problem from Australia as 
related by T. Lewis in the discussion of the NEAR(2) model [Ref. 6]. As 
can be seen in Figure III.C.1, for small values of i, highly correlated 
periods of calm and wind can be generated using the l-Beta-Laplace AR(1) 
model . 

The preceding examples demonstrate the opportunities for continued 
research and are not intended to narrow the focus of future endeavors. 
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VI. SUMMARY AND CONCLUSIONS 



We have indicated by reference to the scientific literature that there 
are important application areas, especially in the physical sciences of time 
series whose marginal distributions are non-Normal. This feature, itself, 
presents new problems in the modelling, study and analysis. 

For those areas where the non-Normality manifests itself primarily in 
the thickness (heaviness) of the tails of the marginal distributions, we 
have demonstrated that within the t-Laplace family of distributions, there 
is an appropriate member with which to model phenomenon with a symmetric 
heavy-tailed marginal distribution. The i-Laplace family has very thick 
tails when 2, is small and a limiting Normal distribution as I increases. 

To account for serial dependence in the time series we have derived two 
families of random processes that extends the random coefficient approach to 
modelling non-Normal time series. The discrete random coefficient models 
(NLARMA(p ,q ) ) have a Laplace marginal distribution and the continuous random 
coefficient models ( 2,-Beta-Laplace AR( 1 ) and MA(q)) have an 2-Laplace 
marginal distribution. Both families are additive models and imitate the 
linear Gaussian models in that they exhibit the usual ARMA(p,q) correlation 
structure. The models are parametrically parsimonious, structurally simple 
and easy to generate on a computer. 

We have also demonstrated that the fourth-order residual analyses based 
on the uncorrelated, but dependent sequence {R^} are appropriate and useful 
methods to discriminate between the discrete random coefficient and the 
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continuous random coefficient models when first, second and third-order 
properties are identical. 

For the purposes of parameter estimation, we derived the joint 
probability density function. Numerical routines were written to maximize 
the likelihood function to estimate the serial correlation coefficient in 
the BELARd) and the TLAR(1) processes. Simulation results indicated that 
this estimator was more efficient and less biased than the least squares 
estimator derived from the linear residual. 

Finally, we summarized some of the remaining issues in this field of 
non-Normal time series analysis. Extensions of the analyses in this thesis 
which need to be pursued are noted, along with possible applications in 
those previously mentioned fields of the physical sciences. 
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